Dataset statistics
| Number of variables | 28 |
|---|---|
| Number of observations | 5043 |
| Missing cells | 2695 |
| Missing cells (%) | 1.9% |
| Duplicate rows | 33 |
| Duplicate rows (%) | 0.7% |
| Total size in memory | 4.9 MiB |
| Average record size in memory | 1020.3 B |
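The two percentages in this overview follow from the counts: the missing-cell share is taken over all cells (observations × variables) and the duplicate share over rows. The sketch below is a minimal pure-Python illustration, not the profiling tool's own code. The helper name `overview` and the representation of rows as tuples with `None` for a missing cell are assumptions made for the example.

```python
from collections import Counter


def overview(rows):
    """Summarise a table given as a list of equal-length tuples (None = missing)."""
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    missing = sum(cell is None for row in rows for cell in row)
    # Count every row that repeats an earlier row (the first occurrence is not counted).
    duplicates = sum(count - 1 for count in Counter(rows).values())
    n_cells = n_obs * n_vars
    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100 * duplicates / n_obs if n_obs else 0.0,
    }


# Toy table (made-up values): 4 rows x 3 columns, one missing cell, one repeated row.
toy = [("Color", 120, 7.1), ("Color", None, 6.5), ("Color", 120, 7.1), ("B&W", 95, 8.0)]
stats = overview(toy)

# Cross-check of the reported percentages, using only the counts in the table above.
report_missing_pct = round(100 * 2695 / (5043 * 28), 1)
report_duplicate_pct = round(100 * 33 / 5043, 1)
```

With the published counts, both cross-checks round to the figures shown in the table (1.9% and 0.7%).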
Variable types
| Categorical | 11 |
|---|---|
| Numeric | 16 |
| URL | 1 |
| Dataset has 33 (0.7%) duplicate rows | Duplicates |
director_name has a high cardinality: 2399 distinct values | High cardinality |
actor_2_name has a high cardinality: 3032 distinct values | High cardinality |
genres has a high cardinality: 914 distinct values | High cardinality |
actor_1_name has a high cardinality: 2097 distinct values | High cardinality |
movie_title has a high cardinality: 4917 distinct values | High cardinality |
actor_3_name has a high cardinality: 3521 distinct values | High cardinality |
plot_keywords has a high cardinality: 4760 distinct values | High cardinality |
country has a high cardinality: 63 distinct values | High cardinality |
num_critic_for_reviews is highly correlated with num_voted_users and 2 other fields | High correlation |
actor_3_facebook_likes is highly correlated with actor_2_facebook_likes | High correlation |
actor_1_facebook_likes is highly correlated with cast_total_facebook_likes | High correlation |
gross is highly correlated with num_voted_users and 1 other fields | High correlation |
num_voted_users is highly correlated with num_critic_for_reviews and 3 other fields | High correlation |
cast_total_facebook_likes is highly correlated with actor_1_facebook_likes and 1 other fields | High correlation |
num_user_for_reviews is highly correlated with num_critic_for_reviews and 2 other fields | High correlation |
actor_2_facebook_likes is highly correlated with actor_3_facebook_likes and 1 other fields | High correlation |
movie_facebook_likes is highly correlated with num_critic_for_reviews and 1 other fields | High correlation |
num_critic_for_reviews is highly correlated with num_voted_users and 1 other fields | High correlation |
actor_3_facebook_likes is highly correlated with actor_1_facebook_likes and 2 other fields | High correlation |
actor_1_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
gross is highly correlated with num_voted_users and 2 other fields | High correlation |
num_voted_users is highly correlated with num_critic_for_reviews and 3 other fields | High correlation |
cast_total_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
num_user_for_reviews is highly correlated with num_critic_for_reviews and 2 other fields | High correlation |
budget is highly correlated with gross and 1 other fields | High correlation |
actor_2_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
num_critic_for_reviews is highly correlated with num_voted_users and 1 other fields | High correlation |
actor_3_facebook_likes is highly correlated with cast_total_facebook_likes and 1 other fields | High correlation |
actor_1_facebook_likes is highly correlated with cast_total_facebook_likes and 1 other fields | High correlation |
gross is highly correlated with num_voted_users | High correlation |
num_voted_users is highly correlated with num_critic_for_reviews and 2 other fields | High correlation |
cast_total_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
num_user_for_reviews is highly correlated with num_critic_for_reviews and 1 other fields | High correlation |
actor_2_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
cast_total_facebook_likes is highly correlated with actor_3_facebook_likes and 2 other fields | High correlation |
actor_3_facebook_likes is highly correlated with cast_total_facebook_likes and 2 other fields | High correlation |
gross is highly correlated with actor_3_facebook_likes and 3 other fields | High correlation |
color is highly correlated with title_year | High correlation |
title_year is highly correlated with color and 1 other fields | High correlation |
duration is highly correlated with country and 3 other fields | High correlation |
country is highly correlated with duration and 2 other fields | High correlation |
content_rating is highly correlated with title_year and 2 other fields | High correlation |
actor_1_facebook_likes is highly correlated with cast_total_facebook_likes | High correlation |
movie_facebook_likes is highly correlated with num_critic_for_reviews and 1 other fields | High correlation |
imdb_score is highly correlated with num_user_for_reviews and 1 other fields | High correlation |
budget is highly correlated with country and 1 other fields | High correlation |
num_user_for_reviews is highly correlated with gross and 3 other fields | High correlation |
aspect_ratio is highly correlated with duration and 1 other fields | High correlation |
language is highly correlated with duration and 2 other fields | High correlation |
num_critic_for_reviews is highly correlated with gross and 3 other fields | High correlation |
actor_2_facebook_likes is highly correlated with cast_total_facebook_likes | High correlation |
num_voted_users is highly correlated with actor_3_facebook_likes and 5 other fields | High correlation |
language is highly correlated with country | High correlation |
country is highly correlated with language | High correlation |
director_name has 103 (2.0%) missing values | Missing |
director_facebook_likes has 104 (2.1%) missing values | Missing |
gross has 884 (17.5%) missing values | Missing |
plot_keywords has 153 (3.0%) missing values | Missing |
content_rating has 303 (6.0%) missing values | Missing |
budget has 492 (9.8%) missing values | Missing |
title_year has 108 (2.1%) missing values | Missing |
aspect_ratio has 329 (6.5%) missing values | Missing |
budget is highly skewed (γ1 = 48.15743539) | Skewed |
movie_title is uniformly distributed | Uniform |
actor_3_name is uniformly distributed | Uniform |
plot_keywords is uniformly distributed | Uniform |
director_facebook_likes has 907 (18.0%) zeros | Zeros |
actor_3_facebook_likes has 89 (1.8%) zeros | Zeros |
facenumber_in_poster has 2152 (42.7%) zeros | Zeros |
actor_2_facebook_likes has 55 (1.1%) zeros | Zeros |
movie_facebook_likes has 2181 (43.2%) zeros | Zeros |
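Several variables appear more than once in the high-correlation alerts, which is probably because the report evaluates more than one correlation measure. The sketch below shows two such measures (Pearson and Spearman) and a simple pairwise screen. The threshold of 0.9, the function names, and the toy numbers are assumptions made for illustration, not values taken from the report configuration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def high_pairs(columns, corr=spearman, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    found = []
    for a, b in combinations(sorted(columns), 2):
        r = corr(columns[a], columns[b])
        if abs(r) >= threshold:
            found.append((a, b, r))
    return found


# Made-up numbers, not taken from the dataset.
toy = {
    "votes": [10, 20, 30, 40, 50],
    "reviews": [1, 4, 9, 16, 25],  # monotone in votes, but not linear
    "score": [7.0, 5.5, 8.1, 6.2, 7.4],
}
flagged = high_pairs(toy)
linear_r = pearson(toy["votes"], toy["reviews"])
```

A rank-based measure flags the monotone but non-linear pair at full strength, while Pearson reports a slightly lower value, so the same pair can cross a threshold under one measure and not under another.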
Reproduction
| Analysis started | 2021-09-08 08:12:40.314092 |
|---|---|
| Analysis finished | 2021-09-08 08:13:27.199459 |
| Duration | 46.89 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
color
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 19 |
| Missing (%) | 0.4% |
| Memory size | 307.2 KiB |
| Color | 4815 |
|---|---|
| Black and White | 209 |
Length
| Max length | 16 |
|---|---|
| Median length | 5 |
| Mean length | 5.457603503 |
| Min length | 5 |
Characters and Unicode
| Total characters | 27419 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
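The character categories reported for this variable (Lowercase Letter, Uppercase Letter, Space Separator) correspond to Unicode general categories. The following sketch counts them with Python's `unicodedata` module. The mapping table, the function name, and the leading space in `" Black and White"` are assumptions; the leading space is inferred from the reported maximum length of 16 and is not stated in the report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category over all non-missing strings."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Column rebuilt from the reported value counts (4815, 209, and 19 missing).
# Assumption: the second value carries a leading space.
column = ["Color"] * 4815 + [" Black and White"] * 209 + [None] * 19
counts = category_counts(column)
total_characters = sum(counts.values())
```

Under that assumption the counts match the totals reported for this variable (27419 characters, of which 627 are spaces).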
Sample
| 1st row | Color |
|---|---|
| 2nd row | Color |
| 3rd row | Color |
| 4th row | Color |
| 5th row | Color |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Color | 4815 | 95.5% |
| Black and White | 209 | 4.1% |
| (Missing) | 19 | 0.4% |
[Figures: Length histogram; Pie chart]
Words
| Value | Count | Frequency (%) |
|---|---|---|
| color | 4815 | 88.5% |
| white | 209 | 3.8% |
| and | 209 | 3.8% |
| black | 209 | 3.8% |
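The value and word tables above can be reproduced with two small counters. This is a sketch under stated assumptions: the function names are made up for the example, words are split on whitespace only, and the column is rebuilt from the reported counts. The profiling tool's own tokenisation may differ, for instance in how it treats punctuation.

```python
from collections import Counter


def common_values(values):
    """Frequency table over all rows; missing cells are reported as '(Missing)'."""
    n = len(values)
    counts = Counter("(Missing)" if v is None else v for v in values)
    return [(value, count, round(100 * count / n, 1)) for value, count in counts.most_common()]


def word_counts(values):
    """Lower-case each non-missing value and count whitespace-separated words."""
    words = Counter()
    for v in values:
        if v is not None:
            words.update(v.lower().split())
    return words


# Column rebuilt from the reported counts; the leading space is an assumption.
column = ["Color"] * 4815 + [" Black and White"] * 209 + [None] * 19
table = common_values(column)
words = word_counts(column)
```

Note the different denominators: value frequencies are taken over all 5043 rows, whereas word frequencies are taken over the 5442 words found in non-missing values.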
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| o | 9630 | 35.1% |
| l | 5024 | 18.3% |
| C | 4815 | 17.6% |
| r | 4815 | 17.6% |
| (space) | 627 | 2.3% |
| a | 418 | 1.5% |
| B | 209 | 0.8% |
| c | 209 | 0.8% |
| k | 209 | 0.8% |
| n | 209 | 0.8% |
| Other values (6) | 1254 | 4.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 21559 | 78.6% |
| Uppercase Letter | 5233 | 19.1% |
| Space Separator | 627 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| o | 9630 | 44.7% |
| l | 5024 | 23.3% |
| r | 4815 | 22.3% |
| a | 418 | 1.9% |
| c | 209 | 1.0% |
| k | 209 | 1.0% |
| n | 209 | 1.0% |
| d | 209 | 1.0% |
| h | 209 | 1.0% |
| i | 209 | 1.0% |
| Other values (2) | 418 | 1.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| C | 4815 | 92.0% |
| B | 209 | 4.0% |
| W | 209 | 4.0% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 627 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 26792 | 97.7% |
| Common | 627 | 2.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| o | 9630 | 35.9% |
| l | 5024 | 18.8% |
| C | 4815 | 18.0% |
| r | 4815 | 18.0% |
| a | 418 | 1.6% |
| B | 209 | 0.8% |
| c | 209 | 0.8% |
| k | 209 | 0.8% |
| n | 209 | 0.8% |
| d | 209 | 0.8% |
| Other values (5) | 1045 | 3.9% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 627 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 27419 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| o | 9630 | 35.1% |
| l | 5024 | 18.3% |
| C | 4815 | 17.6% |
| r | 4815 | 17.6% |
| (space) | 627 | 2.3% |
| a | 418 | 1.5% |
| B | 209 | 0.8% |
| c | 209 | 0.8% |
| k | 209 | 0.8% |
| n | 209 | 0.8% |
| Other values (6) | 1254 | 4.6% |
director_name
Categorical
| Distinct | 2399 |
|---|---|
| Distinct (%) | 48.6% |
| Missing | 103 |
| Missing (%) | 2.0% |
| Memory size | 344.2 KiB |
| Steven Spielberg | 26 |
|---|---|
| Woody Allen | 22 |
| Martin Scorsese | 20 |
| Clint Eastwood | 20 |
| Ridley Scott | 17 |
| Other values (2394) | 4835 |
Length
| Max length | 32 |
|---|---|
| Median length | 13 |
| Mean length | 13.08421053 |
| Min length | 3 |
Characters and Unicode
| Total characters | 64636 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 1505 |
|---|---|
| Unique (%) | 30.5% |
Sample
| 1st row | Tara Subkoff |
|---|---|
| 2nd row | Jaume Balagueró |
| 3rd row | Jaume Balagueró |
| 4th row | Dan Trachtenberg |
| 5th row | Timothy Hines |
Common Values
| Value | Count | Frequency (%) |
| Steven Spielberg | 26 | 0.5% |
| Woody Allen | 22 | 0.4% |
| Martin Scorsese | 20 | 0.4% |
| Clint Eastwood | 20 | 0.4% |
| Ridley Scott | 17 | 0.3% |
| Spike Lee | 16 | 0.3% |
| Tim Burton | 16 | 0.3% |
| Steven Soderbergh | 16 | 0.3% |
| Renny Harlin | 15 | 0.3% |
| Oliver Stone | 14 | 0.3% |
| Other values (2389) | 4758 | |
| (Missing) | 103 | 2.0% |
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
| john | 180 | 1.8% |
| david | 150 | 1.5% |
| michael | 127 | 1.2% |
| james | 87 | 0.8% |
| peter | 85 | 0.8% |
| robert | 84 | 0.8% |
| paul | 81 | 0.8% |
| richard | 80 | 0.8% |
| scott | 65 | 0.6% |
| lee | 58 | 0.6% |
| Other values (2967) | 9279 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6098 | 9.4% |
| (space) | 5336 | 8.3% |
| a | 5279 | 8.2% |
| n | 4658 | 7.2% |
| r | 4449 | 6.9% |
| o | 3795 | 5.9% |
| i | 3693 | 5.7% |
| l | 2970 | 4.6% |
| t | 2321 | 3.6% |
| s | 2089 | 3.2% |
| Other values (66) | 23948 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 48458 | 75.0% |
| Uppercase Letter | 10495 | 16.2% |
| Space Separator | 5336 | 8.3% |
| Other Punctuation | 260 | 0.4% |
| Dash Punctuation | 87 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 6098 | |
| a | 5279 | |
| n | 4658 | |
| r | 4449 | 9.2% |
| o | 3795 | 7.8% |
| i | 3693 | 7.6% |
| l | 2970 | 6.1% |
| t | 2321 | 4.8% |
| s | 2089 | 4.3% |
| h | 1851 | 3.8% |
| Other values (31) | 11255 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 999 | 9.5% |
| J | 925 | 8.8% |
| M | 886 | 8.4% |
| R | 758 | 7.2% |
| C | 712 | 6.8% |
| B | 678 | 6.5% |
| D | 619 | 5.9% |
| A | 569 | 5.4% |
| L | 499 | 4.8% |
| P | 488 | 4.6% |
| Other values (21) | 3362 |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 239 | 91.9% |
| ' | 21 | 8.1% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 5336 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 87 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 58953 | 91.2% |
| Common | 5683 | 8.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 6098 | 10.3% |
| a | 5279 | 9.0% |
| n | 4658 | 7.9% |
| r | 4449 | 7.5% |
| o | 3795 | 6.4% |
| i | 3693 | 6.3% |
| l | 2970 | 5.0% |
| t | 2321 | 3.9% |
| s | 2089 | 3.5% |
| h | 1851 | 3.1% |
| Other values (62) | 21750 |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 5336 | 93.9% |
| . | 239 | 4.2% |
| - | 87 | 1.5% |
| ' | 21 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 64494 | 99.8% |
| Latin 1 Sup | 142 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 6098 | 9.5% |
| (space) | 5336 | 8.3% |
| a | 5279 | 8.2% |
| n | 4658 | 7.2% |
| r | 4449 | 6.9% |
| o | 3795 | 5.9% |
| i | 3693 | 5.7% |
| l | 2970 | 4.6% |
| t | 2321 | 3.6% |
| s | 2089 | 3.2% |
| Other values (46) | 23806 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 45 | |
| á | 19 | |
| ó | 16 | 11.3% |
| ö | 16 | 11.3% |
| í | 8 | 5.6% |
| ñ | 7 | 4.9% |
| å | 6 | 4.2% |
| ç | 5 | 3.5% |
| É | 3 | 2.1% |
| ä | 2 | 1.4% |
| Other values (10) | 15 | 10.6% |
num_critic_for_reviews
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 528 |
|---|---|
| Distinct (%) | 10.6% |
| Missing | 50 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 140.194272 |
| Minimum | 1 |
|---|---|
| Maximum | 813 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 50 |
| median | 110 |
| Q3 | 195 |
| 95-th percentile | 387 |
| Maximum | 813 |
| Range | 812 |
| Interquartile range (IQR) | 145 |
Descriptive statistics
| Standard deviation | 121.6016754 |
|---|---|
| Coefficient of variation (CV) | 0.8673797701 |
| Kurtosis | 2.91341641 |
| Mean | 140.194272 |
| Median Absolute Deviation (MAD) | 68 |
| Skewness | 1.5165327 |
| Sum | 699990 |
| Variance | 14786.96746 |
| Monotonicity | Not monotonic |
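Several of the descriptive statistics are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the median absolute deviation is the median distance from the median. A minimal standard-library sketch follows. The function name `describe` and the toy values are made up, and the use of the sample standard deviation and linear quantile interpolation is an assumption about how the report was computed.

```python
import statistics


def describe(values):
    """Spread statistics for a list of numbers (missing values already removed)."""
    data = sorted(values)
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    median = statistics.median(data)
    # method="inclusive" interpolates linearly between the minimum and the maximum.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "cv": std / mean,
        "mad": statistics.median(abs(x - median) for x in data),
        "variance": std ** 2,
    }


# Made-up values with one extreme observation.
summary = describe([1, 2, 3, 4, 100])

# Consistency checks against the figures reported for num_critic_for_reviews.
reported_cv = round(121.6016754 / 140.194272, 4)
reported_variance = 121.6016754 ** 2
reported_iqr = 195 - 50
```

In the toy data the single extreme value inflates the range to 99 while the IQR (2) and MAD (1) stay small, which is why the report lists these robust measures alongside the standard deviation.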
| Value | Count | Frequency (%) |
| 1 | 43 | 0.9% |
| 9 | 37 | 0.7% |
| 5 | 36 | 0.7% |
| 10 | 35 | 0.7% |
| 8 | 35 | 0.7% |
| 12 | 34 | 0.7% |
| 16 | 33 | 0.7% |
| 81 | 33 | 0.7% |
| 43 | 31 | 0.6% |
| 29 | 30 | 0.6% |
| Other values (518) | 4646 | |
| (Missing) | 50 | 1.0% |
| Value | Count | Frequency (%) |
| 1 | 43 | |
| 2 | 26 | |
| 3 | 24 | |
| 4 | 29 | |
| 5 | 36 | |
| 6 | 28 | |
| 7 | 23 | |
| 8 | 35 | |
| 9 | 37 | |
| 10 | 35 |
| Value | Count | Frequency (%) |
| 813 | 1 | |
| 775 | 1 | |
| 765 | 1 | |
| 750 | 2 | |
| 739 | 1 | |
| 738 | 1 | |
| 733 | 1 | |
| 723 | 1 | |
| 712 | 1 | |
| 703 | 2 |
duration
Real number (ℝ≥0)
| Distinct | 192 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 107.2427435 |
| Minimum | 7 |
|---|---|
| Maximum | 511 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 81 |
| Q1 | 93 |
| median | 103 |
| Q3 | 118 |
| 95-th percentile | 146 |
| Maximum | 511 |
| Range | 504 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 25.49736947 |
|---|---|
| Coefficient of variation (CV) | 0.237753797 |
| Kurtosis | 23.93554408 |
| Mean | 107.2427435 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 2.489825518 |
| Sum | 539431 |
| Variance | 650.1158501 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 161 | 3.2% |
| 100 | 141 | 2.8% |
| 101 | 139 | 2.8% |
| 98 | 135 | 2.7% |
| 97 | 131 | 2.6% |
| 93 | 129 | 2.6% |
| 99 | 124 | 2.5% |
| 95 | 124 | 2.5% |
| 94 | 124 | 2.5% |
| 96 | 113 | 2.2% |
| Other values (182) | 3709 |
| Value | Count | Frequency (%) |
| 7 | 2 | < 0.1% |
| 11 | 1 | < 0.1% |
| 14 | 1 | < 0.1% |
| 20 | 1 | < 0.1% |
| 22 | 7 | |
| 23 | 2 | < 0.1% |
| 24 | 2 | < 0.1% |
| 25 | 4 | |
| 27 | 1 | < 0.1% |
| 28 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 511 | 1 | |
| 379 | 1 | |
| 334 | 1 | |
| 330 | 1 | |
| 325 | 1 | |
| 300 | 1 | |
| 293 | 1 | |
| 289 | 1 | |
| 286 | 1 | |
| 280 | 1 |
director_facebook_likes
Real number (ℝ≥0)
| Distinct | 435 |
|---|---|
| Distinct (%) | 8.8% |
| Missing | 104 |
| Missing (%) | 2.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 686.5092124 |
| Minimum | 0 |
|---|---|
| Maximum | 23000 |
| Zeros | 907 |
| Zeros (%) | 18.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 49 |
| Q3 | 194.5 |
| 95-th percentile | 973 |
| Maximum | 23000 |
| Range | 23000 |
| Interquartile range (IQR) | 187.5 |
Descriptive statistics
| Standard deviation | 2813.328607 |
|---|---|
| Coefficient of variation (CV) | 4.098020181 |
| Kurtosis | 27.25628935 |
| Mean | 686.5092124 |
| Median Absolute Deviation (MAD) | 49 |
| Skewness | 5.22970117 |
| Sum | 3390669 |
| Variance | 7914817.85 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 907 | 18.0% |
| 3 | 70 | 1.4% |
| 6 | 66 | 1.3% |
| 7 | 64 | 1.3% |
| 2 | 63 | 1.2% |
| 4 | 60 | 1.2% |
| 11 | 59 | 1.2% |
| 10 | 53 | 1.1% |
| 8 | 52 | 1.0% |
| 5 | 52 | 1.0% |
| Other values (425) | 3493 | |
| (Missing) | 104 | 2.1% |
| Value | Count | Frequency (%) |
| 0 | 907 | |
| 2 | 63 | 1.2% |
| 3 | 70 | 1.4% |
| 4 | 60 | 1.2% |
| 5 | 52 | 1.0% |
| 6 | 66 | 1.3% |
| 7 | 64 | 1.3% |
| 8 | 52 | 1.0% |
| 9 | 49 | 1.0% |
| 10 | 53 | 1.1% |
| Value | Count | Frequency (%) |
| 23000 | 1 | < 0.1% |
| 22000 | 8 | 0.2% |
| 21000 | 10 | 0.2% |
| 20000 | 1 | < 0.1% |
| 18000 | 4 | 0.1% |
| 17000 | 20 | |
| 16000 | 28 | |
| 15000 | 2 | < 0.1% |
| 14000 | 30 | |
| 13000 | 26 |
actor_3_facebook_likes
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 906 |
|---|---|
| Distinct (%) | 18.0% |
| Missing | 23 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 645.009761 |
| Minimum | 0 |
|---|---|
| Maximum | 23000 |
| Zeros | 89 |
| Zeros (%) | 1.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 133 |
| median | 371.5 |
| Q3 | 636 |
| 95-th percentile | 1000 |
| Maximum | 23000 |
| Range | 23000 |
| Interquartile range (IQR) | 503 |
Descriptive statistics
| Standard deviation | 1665.041728 |
|---|---|
| Coefficient of variation (CV) | 2.581420979 |
| Kurtosis | 60.56388811 |
| Mean | 645.009761 |
| Median Absolute Deviation (MAD) | 248.5 |
| Skewness | 7.279020793 |
| Sum | 3237949 |
| Variance | 2772363.957 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 126 | 2.5% |
| 0 | 89 | 1.8% |
| 11000 | 29 | 0.6% |
| 3 | 28 | 0.6% |
| 2000 | 27 | 0.5% |
| 3000 | 26 | 0.5% |
| 826 | 22 | 0.4% |
| 2 | 21 | 0.4% |
| 7 | 21 | 0.4% |
| 4 | 21 | 0.4% |
| Other values (896) | 4610 | |
| (Missing) | 23 | 0.5% |
| Value | Count | Frequency (%) |
| 0 | 89 | |
| 2 | 21 | 0.4% |
| 3 | 28 | 0.6% |
| 4 | 21 | 0.4% |
| 5 | 18 | 0.4% |
| 6 | 18 | 0.4% |
| 7 | 21 | 0.4% |
| 8 | 17 | 0.3% |
| 9 | 16 | 0.3% |
| 10 | 12 | 0.2% |
| Value | Count | Frequency (%) |
| 23000 | 2 | < 0.1% |
| 20000 | 1 | < 0.1% |
| 19000 | 5 | 0.1% |
| 17000 | 1 | < 0.1% |
| 16000 | 3 | 0.1% |
| 15000 | 1 | < 0.1% |
| 14000 | 6 | 0.1% |
| 13000 | 5 | 0.1% |
| 12000 | 8 | 0.2% |
| 11000 | 29 |
actor_2_name
Categorical
| Distinct | 3032 |
|---|---|
| Distinct (%) | 60.3% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Memory size | 347.4 KiB |
| Morgan Freeman | 20 |
|---|---|
| Charlize Theron | 15 |
| Brad Pitt | 14 |
| James Franco | 11 |
| Meryl Streep | 11 |
| Other values (3027) | 4959 |
Length
| Max length | 28 |
|---|---|
| Median length | 13 |
| Mean length | 13.07435388 |
| Min length | 3 |
Characters and Unicode
| Total characters | 65764 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 2089 |
|---|---|
| Unique (%) | 41.5% |
Sample
| 1st row | Balthazar Getty |
|---|---|
| 2nd row | Pablo Rosso |
| 3rd row | Pablo Rosso |
| 4th row | John Gallagher Jr. |
| 5th row | Kelly LeBrock |
Common Values
| Value | Count | Frequency (%) |
| Morgan Freeman | 20 | 0.4% |
| Charlize Theron | 15 | 0.3% |
| Brad Pitt | 14 | 0.3% |
| James Franco | 11 | 0.2% |
| Meryl Streep | 11 | 0.2% |
| Jason Flemyng | 10 | 0.2% |
| Adam Sandler | 10 | 0.2% |
| Scott Glenn | 9 | 0.2% |
| Steve Buscemi | 9 | 0.2% |
| Judy Greer | 9 | 0.2% |
| Other values (3022) | 4912 | |
| (Missing) | 13 | 0.3% |
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
| michael | 102 | 1.0% |
| david | 60 | 0.6% |
| john | 56 | 0.5% |
| james | 53 | 0.5% |
| scott | 52 | 0.5% |
| tom | 50 | 0.5% |
| jason | 44 | 0.4% |
| robert | 44 | 0.4% |
| kevin | 41 | 0.4% |
| thomas | 39 | 0.4% |
| Other values (3825) | 9861 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6221 | 9.5% |
| a | 5930 | 9.0% |
| (space) | 5372 | 8.2% |
| n | 4762 | 7.2% |
| r | 4398 | 6.7% |
| i | 4018 | 6.1% |
| o | 3645 | 5.5% |
| l | 3420 | 5.2% |
| t | 2348 | 3.6% |
| s | 2160 | 3.3% |
| Other values (70) | 23490 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 49447 | 75.2% |
| Uppercase Letter | 10686 | 16.2% |
| Space Separator | 5372 | 8.2% |
| Other Punctuation | 189 | 0.3% |
| Dash Punctuation | 64 | 0.1% |
| Decimal Number | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 6221 | |
| a | 5930 | |
| n | 4762 | |
| r | 4398 | |
| i | 4018 | 8.1% |
| o | 3645 | 7.4% |
| l | 3420 | 6.9% |
| t | 2348 | 4.7% |
| s | 2160 | 4.4% |
| h | 1796 | 3.6% |
| Other values (38) | 10749 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 999 | 9.3% |
| S | 821 | 7.7% |
| C | 815 | 7.6% |
| B | 773 | 7.2% |
| J | 770 | 7.2% |
| D | 668 | 6.3% |
| A | 640 | 6.0% |
| R | 592 | 5.5% |
| L | 511 | 4.8% |
| T | 463 | 4.3% |
| Other values (16) | 3634 |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 124 | 65.6% |
| ' | 65 | 34.4% |
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 3 | 50.0% |
| 0 | 3 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 5372 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 64 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 60133 | 91.4% |
| Common | 5631 | 8.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 6221 | 10.3% |
| a | 5930 | 9.9% |
| n | 4762 | 7.9% |
| r | 4398 | 7.3% |
| i | 4018 | 6.7% |
| o | 3645 | 6.1% |
| l | 3420 | 5.7% |
| t | 2348 | 3.9% |
| s | 2160 | 3.6% |
| h | 1796 | 3.0% |
| Other values (64) | 21435 |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 5372 | 95.4% |
| . | 124 | 2.2% |
| ' | 65 | 1.2% |
| - | 64 | 1.1% |
| 5 | 3 | 0.1% |
| 0 | 3 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 65642 | 99.8% |
| Latin 1 Sup | 122 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 6221 | 9.5% |
| a | 5930 | 9.0% |
| (space) | 5372 | 8.2% |
| n | 4762 | 7.3% |
| r | 4398 | 6.7% |
| i | 4018 | 6.1% |
| o | 3645 | 5.6% |
| l | 3420 | 5.2% |
| t | 2348 | 3.6% |
| s | 2160 | 3.3% |
| Other values (48) | 23368 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 43 | |
| í | 14 | 11.5% |
| á | 10 | 8.2% |
| ë | 8 | 6.6% |
| ó | 6 | 4.9% |
| ø | 6 | 4.9% |
| å | 5 | 4.1% |
| ü | 4 | 3.3% |
| ö | 3 | 2.5% |
| û | 3 | 2.5% |
| Other values (12) | 20 |
actor_1_facebook_likes
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 878 |
|---|---|
| Distinct (%) | 17.4% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6560.047061 |
| Minimum | 0 |
|---|---|
| Maximum | 640000 |
| Zeros | 26 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 95.5 |
| Q1 | 614 |
| median | 988 |
| Q3 | 11000 |
| 95-th percentile | 24000 |
| Maximum | 640000 |
| Range | 640000 |
| Interquartile range (IQR) | 10386 |
Descriptive statistics
| Standard deviation | 15020.75912 |
|---|---|
| Coefficient of variation (CV) | 2.289733439 |
| Kurtosis | 683.5473559 |
| Mean | 6560.047061 |
| Median Absolute Deviation (MAD) | 752.5 |
| Skewness | 19.12177638 |
| Sum | 33036397 |
| Variance | 225623204.5 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 449 | 8.9% |
| 11000 | 211 | 4.2% |
| 2000 | 197 | 3.9% |
| 3000 | 155 | 3.1% |
| 12000 | 135 | 2.7% |
| 13000 | 127 | 2.5% |
| 14000 | 123 | 2.4% |
| 10000 | 112 | 2.2% |
| 18000 | 109 | 2.2% |
| 22000 | 82 | 1.6% |
| Other values (868) | 3336 |
| Value | Count | Frequency (%) |
| 0 | 26 | |
| 2 | 8 | 0.2% |
| 3 | 4 | 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | 0.1% |
| 6 | 3 | 0.1% |
| 7 | 3 | 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 3 | 0.1% |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 640000 | 1 | < 0.1% |
| 260000 | 3 | 0.1% |
| 164000 | 2 | < 0.1% |
| 137000 | 2 | < 0.1% |
| 87000 | 8 | 0.2% |
| 77000 | 1 | < 0.1% |
| 49000 | 27 | |
| 46000 | 1 | < 0.1% |
| 45000 | 5 | 0.1% |
| 44000 | 2 | < 0.1% |
gross
Real number (ℝ≥0)
| Distinct | 4035 |
|---|---|
| Distinct (%) | 97.0% |
| Missing | 884 |
| Missing (%) | 17.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48468407.53 |
| Minimum | 162 |
|---|---|
| Maximum | 760505847 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 162 |
|---|---|
| 5-th percentile | 99034 |
| Q1 | 5340987.5 |
| median | 25517500 |
| Q3 | 62309437.5 |
| 95-th percentile | 180029729.4 |
| Maximum | 760505847 |
| Range | 760505685 |
| Interquartile range (IQR) | 56968450 |
Descriptive statistics
| Standard deviation | 68452990.44 |
|---|---|
| Coefficient of variation (CV) | 1.412321839 |
| Kurtosis | 14.86886885 |
| Mean | 48468407.53 |
| Median Absolute Deviation (MAD) | 23241132 |
| Skewness | 3.127203838 |
| Sum | 2.015801069 × 10¹¹ |
| Variance | 4.6858119 × 10¹⁵ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 47000000 | 3 | 0.1% |
| 3000000 | 3 | 0.1% |
| 144512310 | 3 | 0.1% |
| 34964818 | 3 | 0.1% |
| 5773519 | 3 | 0.1% |
| 177343675 | 3 | 0.1% |
| 8000000 | 3 | 0.1% |
| 218051260 | 3 | 0.1% |
| 2000000 | 2 | < 0.1% |
| 35799026 | 2 | < 0.1% |
| Other values (4025) | 4131 | |
| (Missing) | 884 | 17.5% |
| Value | Count | Frequency (%) |
| 162 | 1 | |
| 703 | 1 | |
| 721 | 1 | |
| 728 | 1 | |
| 828 | 1 | |
| 1111 | 1 | |
| 1332 | 1 | |
| 1521 | 1 | |
| 1711 | 1 | |
| 2245 | 1 |
| Value | Count | Frequency (%) |
| 760505847 | 1 | |
| 658672302 | 1 | |
| 652177271 | 1 | |
| 623279547 | 2 | |
| 533316061 | 1 | |
| 474544677 | 1 | |
| 460935665 | 1 | |
| 458991599 | 1 | |
| 448130642 | 1 | |
| 436471036 | 1 |
genres
Categorical
| Distinct | 914 |
|---|---|
| Distinct (%) | 18.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 380.9 KiB |
| Drama | 236 |
|---|---|
| Comedy | 209 |
| Comedy\|Drama | 191 |
| Comedy\|Drama\|Romance | 187 |
| Comedy\|Romance | 158 |
| Other values (909) | 4062 |
Length
| Max length | 64 |
|---|---|
| Median length | 20 |
| Mean length | 20.31310728 |
| Min length | 5 |
Characters and Unicode
| Total characters | 102439 |
|---|---|
| Distinct characters | 35 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 495 |
|---|---|
| Unique (%) | 9.8% |
Sample
| 1st row | Drama\|Horror\|Mystery\|Thriller |
|---|---|
| 2nd row | Horror |
| 3rd row | Horror |
| 4th row | Drama\|Horror\|Mystery\|Sci-Fi\|Thriller |
| 5th row | Drama |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Drama | 236 | 4.7% |
| Comedy | 209 | 4.1% |
| Comedy\|Drama | 191 | 3.8% |
| Comedy\|Drama\|Romance | 187 | 3.7% |
| Comedy\|Romance | 158 | 3.1% |
| Drama\|Romance | 152 | 3.0% |
| Crime\|Drama\|Thriller | 101 | 2.0% |
| Horror | 71 | 1.4% |
| Action\|Crime\|Drama\|Thriller | 68 | 1.3% |
| Action\|Crime\|Thriller | 65 | 1.3% |
| Other values (904) | 3605 | 71.5% |
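The high cardinality of genres (914 distinct values) comes from each value being a pipe-joined combination of labels, so the table above counts combinations, not individual genres. Splitting on the separator gives per-genre counts. The sketch below is illustrative only: the function name is made up, and the input is just the five sample rows shown for this variable, not the full column.

```python
from collections import Counter


def genre_counts(values, sep="|"):
    """Count individual genres in a column of separator-joined labels."""
    counts = Counter()
    for value in values:
        if value:  # skip None and empty strings
            counts.update(part.strip() for part in value.split(sep))
    return counts


# The five sample rows listed for this variable.
sample = [
    "Drama|Horror|Mystery|Thriller",
    "Horror",
    "Horror",
    "Drama|Horror|Mystery|Sci-Fi|Thriller",
    "Drama",
]
counts = genre_counts(sample)
```

On these five rows the split yields five distinct genres, with Horror appearing in four rows and Drama in three.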
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
|---|---|---|
| drama | 236 | 4.7% |
| comedy | 209 | 4.1% |
| comedy\|drama | 191 | 3.8% |
| comedy\|drama\|romance | 187 | 3.7% |
| comedy\|romance | 158 | 3.1% |
| drama\|romance | 152 | 3.0% |
| crime\|drama\|thriller | 101 | 2.0% |
| horror | 71 | 1.4% |
| action\|crime\|drama\|thriller | 68 | 1.3% |
| action\|crime\|thriller | 65 | 1.3% |
| Other values (904) | 3605 | 71.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 10547 | 10.3% |
| \| | 9461 | 9.2% |
| a | 9065 | 8.8% |
| e | 7946 | 7.8% |
| m | 7378 | 7.2% |
| i | 6575 | 6.4% |
| o | 6319 | 6.2% |
| y | 4651 | 4.5% |
| n | 4495 | 4.4% |
| t | 4042 | 3.9% |
| Other values (25) | 31960 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 77222 | 75.4% |
| Uppercase Letter | 15131 | 14.8% |
| Math Symbol | 9461 | 9.2% |
| Dash Punctuation | 625 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 10547 | |
| a | 9065 | |
| e | 7946 | |
| m | 7378 | |
| i | 6575 | |
| o | 6319 | |
| y | 4651 | 6.0% |
| n | 4495 | 5.8% |
| t | 4042 | 5.2% |
| l | 3508 | 4.5% |
| Other values (9) | 12696 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 2761 | |
| D | 2715 | |
| A | 2318 | |
| F | 1778 | |
| T | 1413 | |
| R | 1109 | |
| M | 846 | 5.6% |
| S | 804 | 5.3% |
| H | 772 | 5.1% |
| W | 310 | 2.0% |
| Other values (4) | 305 | 2.0% |
Math Symbol
| Value | Count | Frequency (%) |
|---|---|---|
| \| | 9461 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 625 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 92353 | 90.2% |
| Common | 10086 | 9.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 10547 | 11.4% |
| a | 9065 | 9.8% |
| e | 7946 | 8.6% |
| m | 7378 | 8.0% |
| i | 6575 | 7.1% |
| o | 6319 | 6.8% |
| y | 4651 | 5.0% |
| n | 4495 | 4.9% |
| t | 4042 | 4.4% |
| l | 3508 | 3.8% |
| Other values (23) | 27827 |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| \| | 9461 | 93.8% |
| - | 625 | 6.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 102439 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 10547 | 10.3% |
| \| | 9461 | 9.2% |
| a | 9065 | 8.8% |
| e | 7946 | 7.8% |
| m | 7378 | 7.2% |
| i | 6575 | 6.4% |
| o | 6319 | 6.2% |
| y | 4651 | 4.5% |
| n | 4495 | 4.4% |
| t | 4042 | 3.9% |
| Other values (25) | 31960 |
actor_1_name
Categorical
| Distinct | 2097 |
|---|---|
| Distinct (%) | 41.6% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Memory size | 347.3 KiB |
| Robert De Niro | 49 |
|---|---|
| Johnny Depp | 41 |
| Nicolas Cage | 33 |
| J.K. Simmons | 31 |
| Matt Damon | 30 |
| Other values (2092) | 4852 |
Length
| Max length | 27 |
|---|---|
| Median length | 13 |
| Mean length | 13.19241461 |
| Min length | 4 |
Characters and Unicode
| Total characters | 66437 |
|---|---|
| Distinct characters | 76 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 1360 |
|---|---|
| Unique (%) | 27.0% |
Sample
| 1st row | Timothy Hutton |
|---|---|
| 2nd row | Jonathan D. Mellor |
| 3rd row | Manuela Velasco |
| 4th row | Bradley Cooper |
| 5th row | Christopher Lambert |
Common Values
| Value | Count | Frequency (%) |
| Robert De Niro | 49 | 1.0% |
| Johnny Depp | 41 | 0.8% |
| Nicolas Cage | 33 | 0.7% |
| J.K. Simmons | 31 | 0.6% |
| Matt Damon | 30 | 0.6% |
| Bruce Willis | 30 | 0.6% |
| Denzel Washington | 30 | 0.6% |
| Liam Neeson | 29 | 0.6% |
| Harrison Ford | 27 | 0.5% |
| Robin Williams | 27 | 0.5% |
| Other values (2087) | 4709 |
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
| robert | 109 | 1.0% |
| tom | 93 | 0.9% |
| michael | 89 | 0.9% |
| jason | 59 | 0.6% |
| de | 57 | 0.5% |
| james | 54 | 0.5% |
| bruce | 51 | 0.5% |
| steve | 50 | 0.5% |
| jr | 49 | 0.5% |
| niro | 49 | 0.5% |
| Other values (2888) | 9784 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6213 | 9.4% |
| a | 5732 | 8.6% |
| (space) | 5408 | 8.1% |
| n | 4818 | 7.3% |
| r | 4311 | 6.5% |
| i | 4249 | 6.4% |
| o | 3918 | 5.9% |
| l | 3312 | 5.0% |
| t | 2569 | 3.9% |
| s | 2349 | 3.5% |
| Other values (66) | 23558 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 50016 | |
| Uppercase Letter | 10711 | 16.1% |
| Space Separator | 5408 | 8.1% |
| Other Punctuation | 227 | 0.3% |
| Dash Punctuation | 73 | 0.1% |
| Decimal Number | 2 | < 0.1% |
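The category names in the table above are Unicode general categories. As a rough illustration only (the report does not say how it computes them), the sketch below tallies categories with Python's standard `unicodedata` module over the five sample values listed earlier. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# The five sample values listed in the "Sample" table above.
SAMPLE = [
    "Timothy Hutton",
    "Jonathan D. Mellor",
    "Manuela Velasco",
    "Bradley Cooper",
    "Christopher Lambert",
]


def category_counts(values):
    """Tally Unicode general-category codes (e.g. 'Ll', 'Lu', 'Zs') per character."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(SAMPLE)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map to the names used in the report: `Ll` is Lowercase Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator, `Po` is Other Punctuation, `Pd` is Dash Punctuation and `Nd` is Decimal Number.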
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 6213 | |
| a | 5732 | |
| n | 4818 | |
| r | 4311 | |
| i | 4249 | 8.5% |
| o | 3918 | 7.8% |
| l | 3312 | 6.6% |
| t | 2569 | 5.1% |
| s | 2349 | 4.7% |
| h | 1791 | 3.6% |
| Other values (32) | 10754 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 954 | 8.9% |
| M | 912 | 8.5% |
| S | 853 | 8.0% |
| C | 818 | 7.6% |
| B | 741 | 6.9% |
| D | 728 | 6.8% |
| R | 635 | 5.9% |
| H | 524 | 4.9% |
| A | 499 | 4.7% |
| L | 490 | 4.6% |
| Other values (18) | 3557 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 179 | |
| ' | 48 | 21.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 0 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5408 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 73 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 60727 | |
| Common | 5710 | 8.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 6213 | 10.2% |
| a | 5732 | 9.4% |
| n | 4818 | 7.9% |
| r | 4311 | 7.1% |
| i | 4249 | 7.0% |
| o | 3918 | 6.5% |
| l | 3312 | 5.5% |
| t | 2569 | 4.2% |
| s | 2349 | 3.9% |
| h | 1791 | 2.9% |
| Other values (60) | 21465 |
Common
| Value | Count | Frequency (%) |
| (space) | 5408 | 94.7% |
| . | 179 | 3.1% |
| - | 73 | 1.3% |
| ' | 48 | 0.8% |
| 5 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 66357 | |
| Latin 1 Sup | 80 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 6213 | 9.4% |
| a | 5732 | 8.6% |
| (space) | 5408 | 8.1% |
| n | 4818 | 7.3% |
| r | 4311 | 6.5% |
| i | 4249 | 6.4% |
| o | 3918 | 5.9% |
| l | 3312 | 5.0% |
| t | 2569 | 3.9% |
| s | 2349 | 3.5% |
| Other values (48) | 23478 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 20 | |
| ë | 15 | |
| á | 7 | 8.8% |
| í | 6 | 7.5% |
| å | 5 | 6.2% |
| ç | 5 | 6.2% |
| ø | 4 | 5.0% |
| Ó | 3 | 3.8% |
| ü | 2 | 2.5% |
| Á | 2 | 2.5% |
| Other values (8) | 11 |
movie_title
Categorical
HIGH CARDINALITY
| Distinct | 4917 |
|---|---|
| Distinct (%) | 97.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 357.9 KiB |
| King Kong | 3 |
|---|---|
| Halloween | 3 |
| Pan | 3 |
| Victor Frankenstein | 3 |
| Home | 3 |
| Other values (4912) |
Length
| Max length | 86 |
|---|---|
| Median length | 14 |
| Mean length | 15.54987111 |
| Min length | 1 |
Characters and Unicode
| Total characters | 78418 |
|---|---|
| Distinct characters | 96 |
| Distinct categories | 13 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 4798 |
|---|---|
| Unique (%) | 95.1% |
Sample
| 1st row | #Horror |
|---|---|
| 2nd row | [Rec] 2 |
| 3rd row | [Rec] |
| 4th row | 10 Cloverfield Lane |
| 5th row | 10 Days in a Madhouse |
Common Values
| Value | Count | Frequency (%) |
| King Kong | 3 | 0.1% |
| Halloween | 3 | 0.1% |
| Pan | 3 | 0.1% |
| Victor Frankenstein | 3 | 0.1% |
| Home | 3 | 0.1% |
| Ben-Hur | 3 | 0.1% |
| The Fast and the Furious | 3 | 0.1% |
| The Full Monty | 2 | < 0.1% |
| Dawn of the Dead | 2 | < 0.1% |
| The Jungle Book | 2 | < 0.1% |
| Other values (4907) | 5016 |
Words
| Value | Count | Frequency (%) |
| the | 1606 | 11.5% |
| of | 483 | 3.5% |
| a | 193 | 1.4% |
| and | 150 | 1.1% |
| in | 123 | 0.9% |
| to | 107 | 0.8% |
| 2 | 104 | 0.7% |
| | 81 | 0.6% |
| man | 66 | 0.5% |
| love | 56 | 0.4% |
| Other values (4905) | 10987 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 10209 | 13.0% |
| e | 7898 | 10.1% |
| a | 4859 | 6.2% |
| o | 4669 | 6.0% |
| n | 4141 | 5.3% |
| r | 4135 | 5.3% |
| i | 3933 | 5.0% |
| t | 3818 | 4.9% |
| s | 3007 | 3.8% |
| h | 2975 | 3.8% |
| Other values (86) | 28774 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 54383 | |
| Uppercase Letter | 12232 | 15.6% |
| Space Separator | 10209 | 13.0% |
| Other Punctuation | 952 | 1.2% |
| Decimal Number | 527 | 0.7% |
| Dash Punctuation | 95 | 0.1% |
| Open Punctuation | 5 | < 0.1% |
| Close Punctuation | 5 | < 0.1% |
| Currency Symbol | 4 | < 0.1% |
| Other Number | 2 | < 0.1% |
| Other values (3) | 4 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 7898 | |
| a | 4859 | 8.9% |
| o | 4669 | 8.6% |
| n | 4141 | 7.6% |
| r | 4135 | 7.6% |
| i | 3933 | 7.2% |
| t | 3818 | 7.0% |
| s | 3007 | 5.5% |
| h | 2975 | 5.5% |
| l | 2538 | 4.7% |
| Other values (25) | 12410 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1724 | |
| S | 1054 | 8.6% |
| M | 821 | 6.7% |
| B | 778 | 6.4% |
| D | 727 | 5.9% |
| C | 687 | 5.6% |
| A | 664 | 5.4% |
| L | 580 | 4.7% |
| H | 569 | 4.7% |
| W | 505 | 4.1% |
| Other values (17) | 4123 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 371 | |
| ' | 231 | |
| . | 145 | 15.2% |
| , | 79 | 8.3% |
| & | 61 | 6.4% |
| ! | 32 | 3.4% |
| ? | 16 | 1.7% |
| / | 8 | 0.8% |
| * | 5 | 0.5% |
| # | 2 | 0.2% |
| Other values (2) | 2 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 147 | |
| 0 | 87 | |
| 3 | 87 | |
| 1 | 82 | |
| 4 | 35 | 6.6% |
| 8 | 22 | 4.2% |
| 5 | 21 | 4.0% |
| 9 | 17 | 3.2% |
| 7 | 15 | 2.8% |
| 6 | 14 | 2.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3 | |
| [ | 2 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3 | |
| ] | 2 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 2 | |
| ¢ | 2 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10209 | 100.0% |
Other Number
| Value | Count | Frequency (%) |
| ½ | 2 |
Other Symbol
| Value | Count | Frequency (%) |
| ° | 1 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 95 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 2 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 66615 | |
| Common | 11803 | 15.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 7898 | 11.9% |
| a | 4859 | 7.3% |
| o | 4669 | 7.0% |
| n | 4141 | 6.2% |
| r | 4135 | 6.2% |
| i | 3933 | 5.9% |
| t | 3818 | 5.7% |
| s | 3007 | 4.5% |
| h | 2975 | 4.5% |
| l | 2538 | 3.8% |
| Other values (52) | 24642 |
Common
| Value | Count | Frequency (%) |
| (space) | 10209 | 86.5% |
| : | 371 | 3.1% |
| ' | 231 | 2.0% |
| 2 | 147 | 1.2% |
| . | 145 | 1.2% |
| - | 95 | 0.8% |
| 0 | 87 | 0.7% |
| 3 | 87 | 0.7% |
| 1 | 82 | 0.7% |
| , | 79 | 0.7% |
| Other values (24) | 270 | 2.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 78395 | |
| Latin 1 Sup | 23 | < 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 10209 | 13.0% |
| e | 7898 | 10.1% |
| a | 4859 | 6.2% |
| o | 4669 | 6.0% |
| n | 4141 | 5.3% |
| r | 4135 | 5.3% |
| i | 3933 | 5.0% |
| t | 3818 | 4.9% |
| s | 3007 | 3.8% |
| h | 2975 | 3.8% |
| Other values (72) | 28751 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 8 | |
| ½ | 2 | 8.7% |
| ¢ | 2 | 8.7% |
| Æ | 1 | 4.3% |
| ° | 1 | 4.3% |
| ü | 1 | 4.3% |
| í | 1 | 4.3% |
| ó | 1 | 4.3% |
| à | 1 | 4.3% |
| ä | 1 | 4.3% |
| Other values (4) | 4 |
num_voted_users
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 4826 |
|---|---|
| Distinct (%) | 95.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83668.16082 |
| Minimum | 5 |
|---|---|
| Maximum | 1689764 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 514.6 |
| Q1 | 8593.5 |
| median | 34359 |
| Q3 | 96309 |
| 95-th percentile | 332254.9 |
| Maximum | 1689764 |
| Range | 1689759 |
| Interquartile range (IQR) | 87715.5 |
Descriptive statistics
| Standard deviation | 138485.2568 |
|---|---|
| Coefficient of variation (CV) | 1.655172714 |
| Kurtosis | 24.44552017 |
| Mean | 83668.16082 |
| Median Absolute Deviation (MAD) | 30816 |
| Skewness | 4.029871144 |
| Sum | 421938535 |
| Variance | 1.917816635 × 10^10 |
| Monotonicity | Not monotonic |
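Several of the figures above are derived from one another. The sketch below is a sanity check, not the report's own code: it recomputes the coefficient of variation, interquartile range, range and variance from the mean, standard deviation, quartiles and extremes reported for num_voted_users.

```python
# Values copied from the num_voted_users tables above.
mean = 83668.16082
std = 138485.2568
q1, q3 = 8593.5, 96309.0
minimum, maximum = 5, 1689764

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV       = {cv:.9f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
print(f"Variance = {variance:.9e}")
```

A coefficient of variation above 1, together with the large skewness and kurtosis, is consistent with a long right tail: the mean (about 83,668) is more than double the median (34,359).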
| Value | Count | Frequency (%) |
| 57 | 5 | 0.1% |
| 6 | 4 | 0.1% |
| 374 | 3 | 0.1% |
| 62 | 3 | 0.1% |
| 53 | 3 | 0.1% |
| 2541 | 3 | 0.1% |
| 38 | 3 | 0.1% |
| 6025 | 3 | 0.1% |
| 162 | 3 | 0.1% |
| 3119 | 3 | 0.1% |
| Other values (4816) | 5010 |
| Value | Count | Frequency (%) |
| 5 | 2 | |
| 6 | 4 | |
| 7 | 2 | |
| 8 | 3 | |
| 10 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 15 | 2 | |
| 16 | 1 | < 0.1% |
| 18 | 2 | |
| 19 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1689764 | 1 | |
| 1676169 | 1 | |
| 1468200 | 1 | |
| 1347461 | 1 | |
| 1324680 | 1 | |
| 1251222 | 1 | |
| 1238746 | 1 | |
| 1217752 | 1 | |
| 1215718 | 1 | |
| 1155770 | 1 |
cast_total_facebook_likes
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 3978 |
|---|---|
| Distinct (%) | 78.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9699.063851 |
| Minimum | 0 |
|---|---|
| Maximum | 656730 |
| Zeros | 33 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 179 |
| Q1 | 1411 |
| median | 3090 |
| Q3 | 13756.5 |
| 95-th percentile | 36927.7 |
| Maximum | 656730 |
| Range | 656730 |
| Interquartile range (IQR) | 12345.5 |
Descriptive statistics
| Standard deviation | 18163.79912 |
|---|---|
| Coefficient of variation (CV) | 1.872737349 |
| Kurtosis | 361.2551153 |
| Mean | 9699.063851 |
| Median Absolute Deviation (MAD) | 2302 |
| Skewness | 12.83192773 |
| Sum | 48912379 |
| Variance | 329923598.6 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 33 | 0.7% |
| 5 | 7 | 0.1% |
| 2020 | 6 | 0.1% |
| 2 | 6 | 0.1% |
| 673 | 5 | 0.1% |
| 1044 | 5 | 0.1% |
| 29 | 5 | 0.1% |
| 679 | 4 | 0.1% |
| 15 | 4 | 0.1% |
| 81 | 4 | 0.1% |
| Other values (3968) | 4964 |
| Value | Count | Frequency (%) |
| 0 | 33 | |
| 2 | 6 | 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 7 | 0.1% |
| 6 | 2 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 2 | < 0.1% |
| 10 | 1 | < 0.1% |
| 11 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 656730 | 1 | |
| 303717 | 1 | |
| 283939 | 1 | |
| 263584 | 1 | |
| 261818 | 1 | |
| 170118 | 1 | |
| 140268 | 1 | |
| 137712 | 1 | |
| 120797 | 1 | |
| 108016 | 1 |
actor_3_name
Categorical
HIGH CARDINALITY
| Distinct | 3521 |
|---|---|
| Distinct (%) | 70.1% |
| Missing | 23 |
| Missing (%) | 0.5% |
| Memory size | 347.3 KiB |
| Steve Coogan | 8 |
|---|---|
| Ben Mendelsohn | 8 |
| John Heard | 8 |
| Robert Duvall | 7 |
| Stephen Root | 7 |
| Other values (3516) |
Length
| Max length | 29 |
|---|---|
| Median length | 13 |
| Mean length | 13.08227092 |
| Min length | 3 |
Characters and Unicode
| Total characters | 65673 |
|---|---|
| Distinct characters | 81 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 2648 |
|---|---|
| Unique (%) | 52.7% |
Sample
| 1st row | Lydia Hearst |
|---|---|
| 2nd row | Andrea Ros |
| 3rd row | Carlos Lasarte |
| 4th row | Sumalee Montano |
| 5th row | Alexandra Callas |
Common Values
| Value | Count | Frequency (%) |
| Steve Coogan | 8 | 0.2% |
| Ben Mendelsohn | 8 | 0.2% |
| John Heard | 8 | 0.2% |
| Robert Duvall | 7 | 0.1% |
| Stephen Root | 7 | 0.1% |
| Sam Shepard | 7 | 0.1% |
| Jon Gries | 7 | 0.1% |
| Kirsten Dunst | 7 | 0.1% |
| Lois Maxwell | 7 | 0.1% |
| Anne Hathaway | 7 | 0.1% |
| Other values (3511) | 4947 | |
| (Missing) | 23 | 0.5% |
Words
| Value | Count | Frequency (%) |
| michael | 86 | 0.8% |
| john | 80 | 0.8% |
| david | 70 | 0.7% |
| james | 69 | 0.7% |
| robert | 46 | 0.4% |
| tom | 43 | 0.4% |
| paul | 42 | 0.4% |
| kevin | 41 | 0.4% |
| peter | 38 | 0.4% |
| steve | 36 | 0.3% |
| Other values (4307) | 9842 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 6190 | 9.4% |
| a | 5995 | 9.1% |
| (space) | 5373 | 8.2% |
| n | 4589 | 7.0% |
| r | 4183 | 6.4% |
| i | 3975 | 6.1% |
| o | 3584 | 5.5% |
| l | 3508 | 5.3% |
| t | 2354 | 3.6% |
| s | 2343 | 3.6% |
| Other values (71) | 23579 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 49295 | |
| Uppercase Letter | 10690 | 16.3% |
| Space Separator | 5373 | 8.2% |
| Other Punctuation | 234 | 0.4% |
| Dash Punctuation | 79 | 0.1% |
| Decimal Number | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 6190 | |
| a | 5995 | |
| n | 4589 | |
| r | 4183 | 8.5% |
| i | 3975 | 8.1% |
| o | 3584 | 7.3% |
| l | 3508 | 7.1% |
| t | 2354 | 4.8% |
| s | 2343 | 4.8% |
| h | 1857 | 3.8% |
| Other values (34) | 10717 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 986 | 9.2% |
| J | 832 | 7.8% |
| S | 830 | 7.8% |
| B | 806 | 7.5% |
| C | 792 | 7.4% |
| D | 653 | 6.1% |
| R | 615 | 5.8% |
| A | 589 | 5.5% |
| L | 536 | 5.0% |
| K | 464 | 4.3% |
| Other values (21) | 3587 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 171 | |
| ' | 63 | 26.9% |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 0 | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5373 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 79 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59985 | |
| Common | 5688 | 8.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 6190 | 10.3% |
| a | 5995 | 10.0% |
| n | 4589 | 7.7% |
| r | 4183 | 7.0% |
| i | 3975 | 6.6% |
| o | 3584 | 6.0% |
| l | 3508 | 5.8% |
| t | 2354 | 3.9% |
| s | 2343 | 3.9% |
| h | 1857 | 3.1% |
| Other values (65) | 21407 |
Common
| Value | Count | Frequency (%) |
| (space) | 5373 | 94.5% |
| . | 171 | 3.0% |
| - | 79 | 1.4% |
| ' | 63 | 1.1% |
| 5 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 65537 | |
| Latin 1 Sup | 136 | 0.2% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 6190 | 9.4% |
| a | 5995 | 9.1% |
| (space) | 5373 | 8.2% |
| n | 4589 | 7.0% |
| r | 4183 | 6.4% |
| i | 3975 | 6.1% |
| o | 3584 | 5.5% |
| l | 3508 | 5.4% |
| t | 2354 | 3.6% |
| s | 2343 | 3.6% |
| Other values (48) | 23443 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| é | 49 | |
| í | 14 | 10.3% |
| á | 13 | 9.6% |
| ó | 9 | 6.6% |
| ë | 7 | 5.1% |
| ü | 7 | 5.1% |
| à | 6 | 4.4% |
| è | 4 | 2.9% |
| ç | 3 | 2.2% |
| å | 3 | 2.2% |
| Other values (13) | 21 |
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.371172962 |
| Minimum | 0 |
|---|---|
| Maximum | 43 |
| Zeros | 2152 |
| Zeros (%) | 42.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 43 |
| Range | 43 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.01357592 |
|---|---|
| Coefficient of variation (CV) | 1.468506144 |
| Kurtosis | 52.03373533 |
| Mean | 1.371172962 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.384765939 |
| Sum | 6897 |
| Variance | 4.054487986 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2152 | |
| 1 | 1251 | |
| 2 | 716 | 14.2% |
| 3 | 380 | 7.5% |
| 4 | 207 | 4.1% |
| 5 | 114 | 2.3% |
| 6 | 76 | 1.5% |
| 7 | 48 | 1.0% |
| 8 | 37 | 0.7% |
| 9 | 18 | 0.4% |
| Other values (9) | 31 | 0.6% |
| (Missing) | 13 | 0.3% |
| Value | Count | Frequency (%) |
| 0 | 2152 | |
| 1 | 1251 | |
| 2 | 716 | 14.2% |
| 3 | 380 | 7.5% |
| 4 | 207 | 4.1% |
| 5 | 114 | 2.3% |
| 6 | 76 | 1.5% |
| 7 | 48 | 1.0% |
| 8 | 37 | 0.7% |
| 9 | 18 | 0.4% |
| Value | Count | Frequency (%) |
| 43 | 1 | < 0.1% |
| 31 | 1 | < 0.1% |
| 19 | 1 | < 0.1% |
| 15 | 6 | 0.1% |
| 14 | 1 | < 0.1% |
| 13 | 2 | < 0.1% |
| 12 | 4 | 0.1% |
| 11 | 5 | 0.1% |
| 10 | 10 | |
| 9 | 18 |
plot_keywords
Categorical
HIGH CARDINALITY
| Distinct | 4760 |
|---|---|
| Distinct (%) | 97.3% |
| Missing | 153 |
| Missing (%) | 3.0% |
| Memory size | 527.5 KiB |
| based on novel | 4 |
|---|---|
| assistant|experiment|frankenstein|medical student|scientist | 3 |
| one word title | 3 |
| alien friendship|alien invasion|australia|flying car|mother daughter relationship | 3 |
| 1940s|child hero|fantasy world|orphan|reference to peter pan | 3 |
| Other values (4755) |
Length
| Max length | 149 |
|---|---|
| Median length | 50 |
| Mean length | 52.43312883 |
| Min length | 2 |
Characters and Unicode
| Total characters | 256398 |
|---|---|
| Distinct characters | 42 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 4639 |
|---|---|
| Unique (%) | 94.9% |
Sample
| 1st row | bullying|cyberbullying|girl|internet|throat slitting |
|---|---|
| 2nd row | apartment|apartment building|blood sample|crucifix|zombie |
| 3rd row | apartment building|character's point of view camera shot|fire station|subjective camera|television reporter |
| 4th row | alien|bunker|car crash|kidnapping|minimal cast |
| 5th row | dating|protective father|school|shrew|teen movie |
Common Values
| Value | Count | Frequency (%) |
| based on novel | 4 | 0.1% |
| assistant|experiment|frankenstein|medical student|scientist | 3 | 0.1% |
| one word title | 3 | 0.1% |
| alien friendship|alien invasion|australia|flying car|mother daughter relationship | 3 | 0.1% |
| 1940s|child hero|fantasy world|orphan|reference to peter pan | 3 | 0.1% |
| halloween|masked killer|michael myers|slasher|trick or treat | 3 | 0.1% |
| eighteen wheeler|illegal street racing|truck|trucker|undercover cop | 3 | 0.1% |
| animal name in title|ape abducts a woman|gorilla|island|king kong | 3 | 0.1% |
| ghost|haunted|haunting|house|paranormal investigator | 2 | < 0.1% |
| famous line|hand to hand combat|kraken|rape|zeus | 2 | < 0.1% |
| Other values (4750) | 4861 | |
| (Missing) | 153 | 3.0% |
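Each value in this column is a pipe-delimited list of keywords, which is why nearly every row is distinct. The sketch below shows one way to count individual keywords instead of whole strings, using the five sample rows from the table above. The helper name `keyword_counts` is hypothetical, and the approach is an assumption about how one might analyse the column, not something the report itself does.

```python
from collections import Counter

# Sample rows copied from the "Sample" table above.
rows = [
    "bullying|cyberbullying|girl|internet|throat slitting",
    "apartment|apartment building|blood sample|crucifix|zombie",
    "apartment building|character's point of view camera shot|fire station|subjective camera|television reporter",
    "alien|bunker|car crash|kidnapping|minimal cast",
    "dating|protective father|school|shrew|teen movie",
]


def keyword_counts(values):
    """Split each pipe-delimited value and tally the individual keywords."""
    counts = Counter()
    for value in values:
        counts.update(value.split("|"))
    return counts


counts = keyword_counts(rows)
print(len(counts), "distinct keywords")
print(counts["apartment building"], "rows mention 'apartment building'")
```

Counting at the keyword level usually gives a far smaller and more useful vocabulary than the 4760 distinct full strings reported here.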
Words
| Value | Count | Frequency (%) |
| in | 331 | 1.8% |
| of | 222 | 1.2% |
| on | 209 | 1.2% |
| the | 191 | 1.1% |
| a | 185 | 1.0% |
| to | 180 | 1.0% |
| york | 122 | 0.7% |
| based | 106 | 0.6% |
| female | 104 | 0.6% |
| by | 99 | 0.5% |
| Other values (11486) | 16269 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 24818 | 9.7% |
| a | 19577 | 7.6% |
| \| | 19207 | 7.5% |
| i | 18742 | 7.3% |
| r | 18124 | 7.1% |
| t | 16182 | 6.3% |
| n | 15662 | 6.1% |
| o | 15480 | 6.0% |
| s | 13297 | 5.2% |
| (space) | 13128 | 5.1% |
| Other values (32) | 82181 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 222711 | |
| Math Symbol | 19207 | 7.5% |
| Space Separator | 13128 | 5.1% |
| Decimal Number | 1131 | 0.4% |
| Other Punctuation | 219 | 0.1% |
| Open Punctuation | 1 | < 0.1% |
| Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 24818 | |
| a | 19577 | 8.8% |
| i | 18742 | 8.4% |
| r | 18124 | 8.1% |
| t | 16182 | 7.3% |
| n | 15662 | 7.0% |
| o | 15480 | 7.0% |
| s | 13297 | 6.0% |
| l | 11203 | 5.0% |
| c | 9463 | 4.2% |
| Other values (16) | 60163 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 284 | |
| 0 | 270 | |
| 9 | 222 | |
| 2 | 81 | 7.2% |
| 8 | 65 | 5.7% |
| 7 | 49 | 4.3% |
| 5 | 47 | 4.2% |
| 3 | 44 | 3.9% |
| 6 | 38 | 3.4% |
| 4 | 31 | 2.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 130 | |
| ' | 89 |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 19207 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 13128 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 222711 | |
| Common | 33687 | 13.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 24818 | |
| a | 19577 | 8.8% |
| i | 18742 | 8.4% |
| r | 18124 | 8.1% |
| t | 16182 | 7.3% |
| n | 15662 | 7.0% |
| o | 15480 | 7.0% |
| s | 13297 | 6.0% |
| l | 11203 | 5.0% |
| c | 9463 | 4.2% |
| Other values (16) | 60163 |
Common
| Value | Count | Frequency (%) |
| \| | 19207 | 57.0% |
| (space) | 13128 | 39.0% |
| 1 | 284 | 0.8% |
| 0 | 270 | 0.8% |
| 9 | 222 | 0.7% |
| . | 130 | 0.4% |
| ' | 89 | 0.3% |
| 2 | 81 | 0.2% |
| 8 | 65 | 0.2% |
| 7 | 49 | 0.1% |
| Other values (6) | 162 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 256398 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 24818 | 9.7% |
| a | 19577 | 7.6% |
| \| | 19207 | 7.5% |
| i | 18742 | 7.3% |
| r | 18124 | 7.1% |
| t | 16182 | 6.3% |
| n | 15662 | 6.1% |
| o | 15480 | 6.0% |
| s | 13297 | 5.2% |
| (space) | 13128 | 5.1% |
| Other values (32) | 82181 |
| Distinct | 4919 |
|---|---|
| Distinct (%) | 97.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 536.9 KiB |
| http://www.imdb.com/title/tt2224026/?ref_=fn_tt_tt_1 | 3 |
|---|---|
| http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1 | 3 |
| http://www.imdb.com/title/tt1976009/?ref_=fn_tt_tt_1 | 3 |
| http://www.imdb.com/title/tt2638144/?ref_=fn_tt_tt_1 | 3 |
| http://www.imdb.com/title/tt0232500/?ref_=fn_tt_tt_1 | 3 |
| Other values (4914) |
| Value | Count | Frequency (%) |
| http://www.imdb.com/title/tt2224026/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt0360717/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt1976009/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt2638144/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt0232500/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt0077651/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt3332064/?ref_=fn_tt_tt_1 | 3 | 0.1% |
| http://www.imdb.com/title/tt0467406/?ref_=fn_tt_tt_1 | 2 | < 0.1% |
| http://www.imdb.com/title/tt1502712/?ref_=fn_tt_tt_1 | 2 | < 0.1% |
| http://www.imdb.com/title/tt0082517/?ref_=fn_tt_tt_1 | 2 | < 0.1% |
| Other values (4909) | 5016 |
| Value | Count | Frequency (%) |
| http | 5043 |
| Value | Count | Frequency (%) |
| www.imdb.com | 5043 |
| Value | Count | Frequency (%) |
| /title/tt0077651/ | 3 | 0.1% |
| /title/tt3332064/ | 3 | 0.1% |
| /title/tt2224026/ | 3 | 0.1% |
| /title/tt0232500/ | 3 | 0.1% |
| /title/tt1976009/ | 3 | 0.1% |
| /title/tt0360717/ | 3 | 0.1% |
| /title/tt2638144/ | 3 | 0.1% |
| /title/tt1939659/ | 2 | < 0.1% |
| /title/tt0363547/ | 2 | < 0.1% |
| /title/tt0443543/ | 2 | < 0.1% |
| Other values (4909) | 5016 |
| Value | Count | Frequency (%) |
| ref_=fn_tt_tt_1 | 5043 |
| Value | Count | Frequency (%) |
| 5043 |
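The five unlabeled tables above appear to break each URL into its scheme, host, path, query and fragment. The sketch below reproduces that split with Python's standard `urllib.parse.urlsplit`, applied to one URL from the table. It is an illustration under that assumption, not a statement of how the report was generated.

```python
from urllib.parse import urlsplit

# One URL copied from the common-values table above.
url = "http://www.imdb.com/title/tt2224026/?ref_=fn_tt_tt_1"

parts = urlsplit(url)
print("scheme  :", parts.scheme)
print("netloc  :", parts.netloc)
print("path    :", parts.path)
print("query   :", parts.query)
print("fragment:", repr(parts.fragment))  # empty for every row in this column
```

Because the scheme, host and query are identical in all 5043 rows, only the path (which carries the title identifier) distinguishes one record from another.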
num_user_for_reviews
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 954 |
|---|---|
| Distinct (%) | 19.0% |
| Missing | 21 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 272.7708084 |
| Minimum | 1 |
|---|---|
| Maximum | 5060 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 65 |
| median | 156 |
| Q3 | 326 |
| 95-th percentile | 907.8 |
| Maximum | 5060 |
| Range | 5059 |
| Interquartile range (IQR) | 261 |
Descriptive statistics
| Standard deviation | 377.9828856 |
|---|---|
| Coefficient of variation (CV) | 1.385716044 |
| Kurtosis | 26.43829739 |
| Mean | 272.7708084 |
| Median Absolute Deviation (MAD) | 113 |
| Skewness | 4.121475159 |
| Sum | 1369855 |
| Variance | 142871.0618 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 51 | 1.0% |
| 3 | 33 | 0.7% |
| 2 | 32 | 0.6% |
| 26 | 32 | 0.6% |
| 10 | 29 | 0.6% |
| 6 | 28 | 0.6% |
| 50 | 26 | 0.5% |
| 8 | 25 | 0.5% |
| 32 | 25 | 0.5% |
| 31 | 24 | 0.5% |
| Other values (944) | 4717 |
| Value | Count | Frequency (%) |
| 1 | 51 | |
| 2 | 32 | |
| 3 | 33 | |
| 4 | 23 | |
| 5 | 19 | 0.4% |
| 6 | 28 | |
| 7 | 17 | 0.3% |
| 8 | 25 | |
| 9 | 23 | |
| 10 | 29 |
| Value | Count | Frequency (%) |
| 5060 | 1 | |
| 4667 | 1 | |
| 4144 | 1 | |
| 3646 | 1 | |
| 3597 | 1 | |
| 3516 | 1 | |
| 3400 | 1 | |
| 3286 | 1 | |
| 3189 | 1 | |
| 3054 | 1 |
| Distinct | 47 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 12 |
| Missing (%) | 0.2% |
| Memory size | 314.8 KiB |
| English | |
|---|---|
| French | 73 |
| Spanish | 40 |
| Hindi | 28 |
| Mandarin | 26 |
| Other values (42) | 160 |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 6.980719539 |
| Min length | 4 |
Characters and Unicode
| Total characters | 35120 |
|---|---|
| Distinct characters | 43 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 18 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | English |
|---|---|
| 2nd row | Spanish |
| 3rd row | Spanish |
| 4th row | English |
| 5th row | English |
Common Values
| Value | Count | Frequency (%) |
| English | 4704 | |
| French | 73 | 1.4% |
| Spanish | 40 | 0.8% |
| Hindi | 28 | 0.6% |
| Mandarin | 26 | 0.5% |
| German | 19 | 0.4% |
| Japanese | 18 | 0.4% |
| Cantonese | 11 | 0.2% |
| Russian | 11 | 0.2% |
| Italian | 11 | 0.2% |
| Other values (37) | 90 | 1.8% |
| (Missing) | 12 | 0.2% |
Words
| Value | Count | Frequency (%) |
| english | 4704 | |
| french | 73 | 1.5% |
| spanish | 40 | 0.8% |
| hindi | 28 | 0.6% |
| mandarin | 26 | 0.5% |
| german | 19 | 0.4% |
| japanese | 18 | 0.4% |
| cantonese | 11 | 0.2% |
| italian | 11 | 0.2% |
| russian | 11 | 0.2% |
| Other values (37) | 90 | 1.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 5032 | |
| i | 4906 | |
| h | 4845 | |
| s | 4828 | |
| l | 4731 | |
| g | 4722 | |
| E | 4704 | |
| a | 252 | 0.7% |
| e | 217 | 0.6% |
| r | 160 | 0.5% |
| Other values (33) | 723 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 30089 | |
| Uppercase Letter | 5031 | 14.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 5032 | |
| i | 4906 | |
| h | 4845 | |
| s | 4828 | |
| l | 4731 | |
| g | 4722 | |
| a | 252 | 0.8% |
| e | 217 | 0.7% |
| r | 160 | 0.5% |
| c | 88 | 0.3% |
| Other values (13) | 308 | 1.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 4704 | |
| F | 74 | 1.5% |
| S | 47 | 0.9% |
| H | 34 | 0.7% |
| M | 28 | 0.6% |
| G | 20 | 0.4% |
| J | 18 | 0.4% |
| P | 17 | 0.3% |
| C | 15 | 0.3% |
| I | 15 | 0.3% |
| Other values (10) | 59 | 1.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 35120 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 5032 | |
| i | 4906 | |
| h | 4845 | |
| s | 4828 | |
| l | 4731 | |
| g | 4722 | |
| E | 4704 | |
| a | 252 | 0.7% |
| e | 217 | 0.6% |
| r | 160 | 0.5% |
| Other values (33) | 723 | 2.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 35120 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 5032 | |
| i | 4906 | |
| h | 4845 | |
| s | 4828 | |
| l | 4731 | |
| g | 4722 | |
| E | 4704 | |
| a | 252 | 0.7% |
| e | 217 | 0.6% |
| r | 160 | 0.5% |
| Other values (33) | 723 | 2.1% |
country
Categorical
HIGH CARDINALITY
| Distinct | 63 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 5 |
| Missing (%) | 0.1% |
| Memory size | 297.9 KiB |
| USA | |
|---|---|
| UK | |
| France | 154 |
| Canada | 126 |
| Germany | 97 |
| Other values (58) |
Length
| Max length | 20 |
|---|---|
| Median length | 3 |
| Mean length | 3.486304089 |
| Min length | 2 |
Characters and Unicode
| Total characters | 17564 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 26 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | USA |
|---|---|
| 2nd row | Spain |
| 3rd row | Spain |
| 4th row | USA |
| 5th row | USA |
Common Values
| Value | Count | Frequency (%) |
| USA | 3809 | |
| UK | 448 | 8.9% |
| France | 154 | 3.1% |
| Canada | 126 | 2.5% |
| Germany | 97 | 1.9% |
| Australia | 55 | 1.1% |
| India | 34 | 0.7% |
| Spain | 33 | 0.7% |
| China | 30 | 0.6% |
| Italy | 23 | 0.5% |
| Other values (53) | 229 | 4.5% |
Words
| Value | Count | Frequency (%) |
| usa | 3809 | |
| uk | 448 | 8.8% |
| france | 154 | 3.0% |
| canada | 126 | 2.5% |
| germany | 100 | 2.0% |
| australia | 55 | 1.1% |
| india | 34 | 0.7% |
| spain | 33 | 0.6% |
| china | 30 | 0.6% |
| japan | 23 | 0.5% |
| Other values (60) | 290 | 5.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| U | 4259 | |
| A | 3879 | |
| S | 3876 | |
| a | 1092 | 6.2% |
| n | 637 | 3.6% |
| K | 481 | 2.7% |
| e | 407 | 2.3% |
| r | 404 | 2.3% |
| i | 245 | 1.4% |
| d | 218 | 1.2% |
| Other values (36) | 2066 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 13168 | |
| Lowercase Letter | 4332 | 24.7% |
| Space Separator | 64 | 0.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1092 | |
| n | 637 | |
| e | 407 | 9.4% |
| r | 404 | 9.3% |
| i | 245 | 5.7% |
| d | 218 | 5.0% |
| c | 192 | 4.4% |
| l | 153 | 3.5% |
| y | 139 | 3.2% |
| m | 126 | 2.9% |
| Other values (14) | 719 |
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 4259 | |
| A | 3879 | |
| S | 3876 | |
| K | 481 | 3.7% |
| C | 163 | 1.2% |
| F | 155 | 1.2% |
| G | 103 | 0.8% |
| I | 81 | 0.6% |
| N | 29 | 0.2% |
| J | 23 | 0.2% |
| Other values (11) | 119 | 0.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 64 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17500 | |
| Common | 64 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| U | 4259 | |
| A | 3879 | |
| S | 3876 | |
| a | 1092 | 6.2% |
| n | 637 | 3.6% |
| K | 481 | 2.7% |
| e | 407 | 2.3% |
| r | 404 | 2.3% |
| i | 245 | 1.4% |
| d | 218 | 1.2% |
| Other values (35) | 2002 |
Common
| Value | Count | Frequency (%) |
| (space) | 64 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17564 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| U | 4259 | |
| A | 3879 | |
| S | 3876 | |
| a | 1092 | 6.2% |
| n | 637 | 3.6% |
| K | 481 | 2.7% |
| e | 407 | 2.3% |
| r | 404 | 2.3% |
| i | 245 | 1.4% |
| d | 218 | 1.2% |
| Other values (36) | 2066 |
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 303 |
| Missing (%) | 6.0% |
| Memory size | 286.5 KiB |
| R | |
|---|---|
| PG-13 | |
| PG | |
| Not Rated | 116 |
| G | 112 |
| Other values (13) |
Length
| Max length | 9 |
|---|---|
| Median length | 2 |
| Mean length | 2.813924051 |
| Min length | 1 |
Characters and Unicode
| Total characters | 13338 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Not Rated |
|---|---|
| 2nd row | R |
| 3rd row | R |
| 4th row | PG-13 |
| 5th row | R |
Common Values
| Value | Count | Frequency (%) |
| R | 2118 | |
| PG-13 | 1461 | |
| PG | 701 | 13.9% |
| Not Rated | 116 | 2.3% |
| G | 112 | 2.2% |
| Unrated | 62 | 1.2% |
| Approved | 55 | 1.1% |
| TV-14 | 30 | 0.6% |
| TV-MA | 20 | 0.4% |
| TV-PG | 13 | 0.3% |
| Other values (8) | 52 | 1.0% |
| (Missing) | 303 | 6.0% |
Words
| Value | Count | Frequency (%) |
| r | 2118 | |
| pg-13 | 1461 | |
| pg | 701 | 14.4% |
| not | 116 | 2.4% |
| rated | 116 | 2.4% |
| g | 112 | 2.3% |
| unrated | 62 | 1.3% |
| approved | 55 | 1.1% |
| tv-14 | 30 | 0.6% |
| tv-ma | 20 | 0.4% |
| Other values (9) | 65 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 2303 | |
| R | 2234 | |
| P | 2190 | |
| - | 1543 | |
| 1 | 1498 | |
| 3 | 1461 | |
| t | 294 | 2.2% |
| e | 242 | 1.8% |
| d | 242 | 1.8% |
| a | 187 | 1.4% |
| Other values (18) | 1144 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 7184 | |
| Decimal Number | 2997 | |
| Dash Punctuation | 1543 | 11.6% |
| Lowercase Letter | 1498 | 11.2% |
| Space Separator | 116 | 0.9% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 2303 | |
| R | 2234 | |
| P | 2190 | |
| N | 123 | 1.7% |
| T | 75 | 1.0% |
| V | 75 | 1.0% |
| A | 75 | 1.0% |
| U | 62 | 0.9% |
| M | 25 | 0.3% |
| X | 13 | 0.2% |
| Other values (2) | 9 | 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 294 | |
| e | 242 | |
| d | 242 | |
| a | 187 | |
| o | 171 | |
| r | 117 | 7.8% |
| p | 110 | 7.3% |
| n | 62 | 4.1% |
| v | 55 | 3.7% |
| s | 18 | 1.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1498 | |
| 3 | 1461 | |
| 4 | 30 | 1.0% |
| 7 | 8 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 116 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1543 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8682 | |
| Common | 4656 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 2303 | |
| R | 2234 | |
| P | 2190 | |
| t | 294 | 3.4% |
| e | 242 | 2.8% |
| d | 242 | 2.8% |
| a | 187 | 2.2% |
| o | 171 | 2.0% |
| N | 123 | 1.4% |
| r | 117 | 1.3% |
| Other values (12) | 579 | 6.7% |
Common
| Value | Count | Frequency (%) |
| - | 1543 | |
| 1 | 1498 | |
| 3 | 1461 | |
| (space) | 116 | 2.5% |
| 4 | 30 | 0.6% |
| 7 | 8 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13338 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| G | 2303 | |
| R | 2234 | |
| P | 2190 | |
| - | 1543 | |
| 1 | 1498 | |
| 3 | 1461 | |
| t | 294 | 2.2% |
| e | 242 | 1.8% |
| d | 242 | 1.8% |
| a | 187 | 1.4% |
| Other values (18) | 1144 |
| Distinct | 439 |
|---|---|
| Distinct (%) | 9.6% |
| Missing | 492 |
| Missing (%) | 9.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39752620.44 |
| Minimum | 218 |
|---|---|
| Maximum | 1.22155 × 10¹⁰ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 218 |
|---|---|
| 5-th percentile | 500000 |
| Q1 | 6000000 |
| median | 20000000 |
| Q3 | 45000000 |
| 95-th percentile | 130000000 |
| Maximum | 1.22155 × 10¹⁰ |
| Range | 1.221549978 × 10¹⁰ |
| Interquartile range (IQR) | 39000000 |
Descriptive statistics
| Standard deviation | 206114898.4 |
|---|---|
| Coefficient of variation (CV) | 5.184938658 |
| Kurtosis | 2724.257433 |
| Mean | 39752620.44 |
| Median Absolute Deviation (MAD) | 16000000 |
| Skewness | 48.15743539 |
| Sum | 1.809141756 × 10¹¹ |
| Variance | 4.248335136 × 10¹⁶ |
| Monotonicity | Not monotonic |
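The derived figures in the descriptive statistics above can be cross-checked against the other reported values. The following is a minimal sketch, assuming the non-missing count is the 5043 observations from the dataset overview minus the 492 missing values. All inputs are copied from the tables and are rounded, so the comparisons use a tolerance rather than exact equality.

```python
import math

# Values copied from the report tables (rounded as displayed there).
n_observations = 5043          # from the dataset overview
n_missing = 492
count = n_observations - n_missing

mean = 39752620.44
std = 206114898.4
total = 1.809141756e11
variance = 4.248335136e16
q1, q3 = 6_000_000, 45_000_000
minimum, maximum = 218, 1.22155e10

# Derived statistics, recomputed from their definitions.
cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"CV    = {cv:.6f}")
print(f"IQR   = {iqr}")
print(f"range = {value_range:.6e}")
print("mean matches sum / count:", math.isclose(total / count, mean, rel_tol=1e-6))
print("variance matches std**2: ", math.isclose(std ** 2, variance, rel_tol=1e-6))
```

A coefficient of variation above 5 together with a maximum roughly 600 times the median is consistent with the very large skewness and kurtosis reported for this variable.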
| Value | Count | Frequency (%) |
| 20000000 | 174 | 3.5% |
| 15000000 | 143 | 2.8% |
| 25000000 | 142 | 2.8% |
| 30000000 | 141 | 2.8% |
| 10000000 | 135 | 2.7% |
| 40000000 | 131 | 2.6% |
| 35000000 | 120 | 2.4% |
| 5000000 | 111 | 2.2% |
| 50000000 | 101 | 2.0% |
| 12000000 | 92 | 1.8% |
| Other values (429) | 3261 | |
| (Missing) | 492 | 9.8% |
| Value | Count | Frequency (%) |
| 218 | 1 | < 0.1% |
| 1100 | 1 | < 0.1% |
| 1400 | 1 | < 0.1% |
| 3250 | 1 | < 0.1% |
| 4500 | 1 | < 0.1% |
| 7000 | 3 | |
| 9000 | 1 | < 0.1% |
| 10000 | 3 | |
| 13000 | 1 | < 0.1% |
| 14000 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1.22155 × 10¹⁰ | 1 | |
| 4200000000 | 1 | |
| 2500000000 | 1 | |
| 2400000000 | 1 | |
| 2127519898 | 1 | |
| 1100000000 | 1 | |
| 1000000000 | 1 | |
| 700000000 | 2 | |
| 600000000 | 1 | |
| 553632000 | 1 |
| Distinct | 91 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 108 |
| Missing (%) | 2.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2002.470517 |
| Minimum | 1916 |
|---|---|
| Maximum | 2016 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 1916 |
|---|---|
| 5-th percentile | 1979 |
| Q1 | 1999 |
| median | 2005 |
| Q3 | 2011 |
| 95-th percentile | 2015 |
| Maximum | 2016 |
| Range | 100 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 12.47459892 |
|---|---|
| Coefficient of variation (CV) | 0.006229604289 |
| Kurtosis | 7.439212616 |
| Mean | 2002.470517 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -2.29227335 |
| Sum | 9882192 |
| Variance | 155.6156182 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2009 | 260 | 5.2% |
| 2014 | 252 | 5.0% |
| 2006 | 239 | 4.7% |
| 2013 | 237 | 4.7% |
| 2010 | 230 | 4.6% |
| 2015 | 226 | 4.5% |
| 2008 | 225 | 4.5% |
| 2011 | 225 | 4.5% |
| 2005 | 221 | 4.4% |
| 2012 | 221 | 4.4% |
| Other values (81) | 2599 |
| Value | Count | Frequency (%) |
| 1916 | 1 | |
| 1920 | 1 | |
| 1925 | 1 | |
| 1927 | 1 | |
| 1929 | 2 | |
| 1930 | 1 | |
| 1932 | 1 | |
| 1933 | 2 | |
| 1934 | 1 | |
| 1935 | 1 |
| Value | Count | Frequency (%) |
| 2016 | 106 | |
| 2015 | 226 | |
| 2014 | 252 | |
| 2013 | 237 | |
| 2012 | 221 | |
| 2011 | 225 | |
| 2010 | 230 | |
| 2009 | 260 | |
| 2008 | 225 | |
| 2007 | 204 |
actor_2_facebook_likes
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 917 |
|---|---|
| Distinct (%) | 18.2% |
| Missing | 13 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1651.754473 |
| Minimum | 0 |
|---|---|
| Maximum | 137000 |
| Zeros | 55 |
| Zeros (%) | 1.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 281 |
| median | 595 |
| Q3 | 918 |
| 95-th percentile | 11000 |
| Maximum | 137000 |
| Range | 137000 |
| Interquartile range (IQR) | 637 |
Descriptive statistics
| Standard deviation | 4042.438863 |
|---|---|
| Coefficient of variation (CV) | 2.447360627 |
| Kurtosis | 256.7951889 |
| Mean | 1651.754473 |
| Median Absolute Deviation (MAD) | 317 |
| Skewness | 9.884733179 |
| Sum | 8308325 |
| Variance | 16341311.96 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1000 | 309 | 6.1% |
| 11000 | 111 | 2.2% |
| 2000 | 100 | 2.0% |
| 3000 | 76 | 1.5% |
| 0 | 55 | 1.1% |
| 10000 | 47 | 0.9% |
| 14000 | 41 | 0.8% |
| 13000 | 40 | 0.8% |
| 826 | 37 | 0.7% |
| 4000 | 34 | 0.7% |
| Other values (907) | 4180 |
| Value | Count | Frequency (%) |
| 0 | 55 | |
| 2 | 14 | 0.3% |
| 3 | 14 | 0.3% |
| 4 | 12 | 0.2% |
| 5 | 10 | 0.2% |
| 6 | 7 | 0.1% |
| 7 | 4 | 0.1% |
| 8 | 9 | 0.2% |
| 9 | 13 | 0.3% |
| 10 | 9 | 0.2% |
| Value | Count | Frequency (%) |
| 137000 | 1 | < 0.1% |
| 29000 | 1 | < 0.1% |
| 27000 | 2 | < 0.1% |
| 25000 | 3 | 0.1% |
| 23000 | 6 | |
| 22000 | 11 | |
| 21000 | 4 | 0.1% |
| 20000 | 6 | |
| 19000 | 7 | |
| 18000 | 9 |
| Distinct | 78 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.442137616 |
| Minimum | 1.6 |
|---|---|
| Maximum | 9.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 1.6 |
|---|---|
| 5-th percentile | 4.4 |
| Q1 | 5.8 |
| median | 6.6 |
| Q3 | 7.2 |
| 95-th percentile | 8.09 |
| Maximum | 9.5 |
| Range | 7.9 |
| Interquartile range (IQR) | 1.4 |
Descriptive statistics
| Standard deviation | 1.125115866 |
|---|---|
| Coefficient of variation (CV) | 0.1746494615 |
| Kurtosis | 0.9356915064 |
| Mean | 6.442137616 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | -0.7414713363 |
| Sum | 32487.7 |
| Variance | 1.265885711 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6.7 | 223 | 4.4% |
| 6.6 | 201 | 4.0% |
| 7.2 | 195 | 3.9% |
| 6.5 | 186 | 3.7% |
| 6.4 | 185 | 3.7% |
| 7 | 184 | 3.6% |
| 7.3 | 184 | 3.6% |
| 6.8 | 181 | 3.6% |
| 7.1 | 181 | 3.6% |
| 6.1 | 179 | 3.5% |
| Other values (68) | 3144 |
| Value | Count | Frequency (%) |
| 1.6 | 1 | < 0.1% |
| 1.7 | 1 | < 0.1% |
| 1.9 | 3 | |
| 2 | 2 | |
| 2.1 | 3 | |
| 2.2 | 3 | |
| 2.3 | 3 | |
| 2.4 | 2 | |
| 2.5 | 2 | |
| 2.6 | 2 |
| Value | Count | Frequency (%) |
| 9.5 | 1 | < 0.1% |
| 9.3 | 1 | < 0.1% |
| 9.2 | 1 | < 0.1% |
| 9.1 | 3 | 0.1% |
| 9 | 3 | 0.1% |
| 8.9 | 5 | 0.1% |
| 8.8 | 7 | 0.1% |
| 8.7 | 13 | |
| 8.6 | 15 | |
| 8.5 | 24 |
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 329 |
| Missing (%) | 6.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.220403055 |
| Minimum | 1.18 |
|---|---|
| Maximum | 16 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 1.18 |
|---|---|
| 5-th percentile | 1.66 |
| Q1 | 1.85 |
| median | 2.35 |
| Q3 | 2.35 |
| 95-th percentile | 2.35 |
| Maximum | 16 |
| Range | 14.82 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 1.385112535 |
|---|---|
| Coefficient of variation (CV) | 0.6238113087 |
| Kurtosis | 90.65322055 |
| Mean | 2.220403055 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 9.390056312 |
| Sum | 10466.98 |
| Variance | 1.918536735 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.35 | 2360 | |
| 1.85 | 1906 | |
| 1.78 | 110 | 2.2% |
| 1.37 | 100 | 2.0% |
| 1.33 | 68 | 1.3% |
| 1.66 | 64 | 1.3% |
| 16 | 45 | 0.9% |
| 2.2 | 15 | 0.3% |
| 2.39 | 15 | 0.3% |
| 4 | 7 | 0.1% |
| Other values (12) | 24 | 0.5% |
| (Missing) | 329 | 6.5% |
| Value | Count | Frequency (%) |
| 1.18 | 1 | < 0.1% |
| 1.2 | 1 | < 0.1% |
| 1.33 | 68 | |
| 1.37 | 100 | |
| 1.44 | 1 | < 0.1% |
| 1.5 | 2 | < 0.1% |
| 1.66 | 64 | |
| 1.75 | 3 | 0.1% |
| 1.77 | 1 | < 0.1% |
| 1.78 | 110 |
| Value | Count | Frequency (%) |
| 16 | 45 | 0.9% |
| 4 | 7 | 0.1% |
| 2.76 | 3 | 0.1% |
| 2.55 | 2 | < 0.1% |
| 2.4 | 3 | 0.1% |
| 2.39 | 15 | 0.3% |
| 2.35 | 2360 | |
| 2.24 | 1 | < 0.1% |
| 2.2 | 15 | 0.3% |
| 2 | 5 | 0.1% |
| Distinct | 876 |
|---|---|
| Distinct (%) | 17.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7525.964505 |
| Minimum | 0 |
|---|---|
| Maximum | 349000 |
| Zeros | 2181 |
| Zeros (%) | 43.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 166 |
| Q3 | 3000 |
| 95-th percentile | 40000 |
| Maximum | 349000 |
| Range | 349000 |
| Interquartile range (IQR) | 3000 |
Descriptive statistics
| Standard deviation | 19320.44511 |
|---|---|
| Coefficient of variation (CV) | 2.567171968 |
| Kurtosis | 41.33443692 |
| Mean | 7525.964505 |
| Median Absolute Deviation (MAD) | 166 |
| Skewness | 5.05892689 |
| Sum | 37953439 |
| Variance | 373279599.2 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2181 | |
| 1000 | 109 | 2.2% |
| 11000 | 83 | 1.6% |
| 10000 | 81 | 1.6% |
| 12000 | 62 | 1.2% |
| 13000 | 58 | 1.2% |
| 2000 | 56 | 1.1% |
| 15000 | 53 | 1.1% |
| 14000 | 50 | 1.0% |
| 16000 | 47 | 0.9% |
| Other values (866) | 2263 |
| Value | Count | Frequency (%) |
| 0 | 2181 | |
| 2 | 2 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 5 | 0.1% |
| 5 | 2 | < 0.1% |
| 7 | 3 | 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 3 | 0.1% |
| 10 | 2 | < 0.1% |
| 11 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 349000 | 1 | |
| 199000 | 1 | |
| 197000 | 1 | |
| 191000 | 1 | |
| 190000 | 1 | |
| 175000 | 1 | |
| 166000 | 1 | |
| 165000 | 1 | |
| 164000 | 1 | |
| 153000 | 1 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
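The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the report's own implementation; the function name `pearson_r` and the sample data are made up for the example. The 1/n factors in the covariance and the standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (co)deviations; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

# Hypothetical sample data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.0, 11.0]

r_original = pearson_r(x, y)
# Shift and rescale both variables (positive scale factors): r is unchanged.
r_rescaled = pearson_r([10 * v + 7 for v in x], [0.5 * v - 3 for v in y])
# A perfectly decreasing linear relationship gives r = -1 (up to rounding).
r_negative = pearson_r(x, [-v for v in x])

print(r_original, r_rescaled, r_negative)
```

Note that the invariance shown here holds for positive scale factors; multiplying one variable by a negative constant flips the sign of r.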
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
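As a sketch of the definition above, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, tied values are assumed to share the average of their ranks, and the sample data is made up.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because the relationship is perfectly monotonic, ρ comes out at 1 (up to rounding), while Pearson's r stays below 1 because the relationship is not linear.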
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
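The pair-counting definition above can be sketched directly. This version treats tied pairs as neither concordant nor discordant and divides by the total number of pairs, which corresponds to the variant usually called tau-a; whether the report uses a tie-adjusted variant is not stated here. The function name and the sample data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one swapped pair (positions 2 and 3) out of six pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.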
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | actor_1_name | movie_title | num_voted_users | cast_total_facebook_likes | actor_3_name | facenumber_in_poster | plot_keywords | movie_imdb_link | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Color | Tara Subkoff | 35.0 | 101.0 | 37.0 | 56.0 | Balthazar Getty | 501.0 | NaN | Drama\|Horror\|Mystery\|Thriller | Timothy Hutton | #Horror | 1547 | 1044 | Lydia Hearst | 1.0 | bullying\|cyberbullying\|girl\|internet\|throat slitting | http://www.imdb.com/title/tt3526286/?ref_=fn_tt_tt_1 | 42.0 | English | USA | Not Rated | 1500000.0 | 2015.0 | 418.0 | 3.3 | NaN | 750 |
| 1 | Color | Jaume Balagueró | 222.0 | 85.0 | 57.0 | 6.0 | Pablo Rosso | 37.0 | 27024.0 | Horror | Jonathan D. Mellor | [Rec] 2 | 55597 | 73 | Andrea Ros | 0.0 | apartment\|apartment building\|blood sample\|crucifix\|zombie | http://www.imdb.com/title/tt1245112/?ref_=fn_tt_tt_1 | 148.0 | Spanish | Spain | R | 5600000.0 | 2009.0 | 9.0 | 6.6 | 1.85 | 4000 |
| 2 | Color | Jaume Balagueró | 252.0 | 78.0 | 57.0 | 7.0 | Pablo Rosso | 120.0 | NaN | Horror | Manuela Velasco | [Rec] | 131462 | 145 | Carlos Lasarte | 0.0 | apartment building\|character's point of view camera shot\|fire station\|subjective camera\|television reporter | http://www.imdb.com/title/tt1038988/?ref_=fn_tt_tt_1 | 374.0 | Spanish | Spain | R | 1500000.0 | 2007.0 | 9.0 | 7.5 | 1.85 | 15000 |
| 3 | Color | Dan Trachtenberg | 411.0 | 104.0 | 16.0 | 82.0 | John Gallagher Jr. | 14000.0 | 71897215.0 | Drama\|Horror\|Mystery\|Sci-Fi\|Thriller | Bradley Cooper | 10 Cloverfield Lane | 126893 | 14504 | Sumalee Montano | 0.0 | alien\|bunker\|car crash\|kidnapping\|minimal cast | http://www.imdb.com/title/tt1179933/?ref_=fn_tt_tt_1 | 440.0 | English | USA | PG-13 | 15000000.0 | 2016.0 | 338.0 | 7.3 | 2.35 | 33000 |
| 4 | Color | Timothy Hines | 1.0 | 111.0 | 0.0 | 247.0 | Kelly LeBrock | 1000.0 | 14616.0 | Drama | Christopher Lambert | 10 Days in a Madhouse | 314 | 2059 | Alexandra Callas | 1.0 | NaN | http://www.imdb.com/title/tt3453052/?ref_=fn_tt_tt_1 | 10.0 | English | USA | R | 12000000.0 | 2015.0 | 445.0 | 7.5 | 1.85 | 26000 |
| 5 | Color | Gil Junger | 133.0 | 97.0 | 19.0 | 835.0 | Heath Ledger | 23000.0 | 38176108.0 | Comedy\|Drama\|Romance | Joseph Gordon-Levitt | 10 Things I Hate About You | 222099 | 37907 | Andrew Keegan | 6.0 | dating\|protective father\|school\|shrew\|teen movie | http://www.imdb.com/title/tt0147800/?ref_=fn_tt_tt_1 | 549.0 | English | USA | PG-13 | 16000000.0 | 1999.0 | 13000.0 | 7.2 | 1.85 | 10000 |
| 6 | NaN | Christopher Barnard | NaN | 22.0 | 0.0 | NaN | NaN | 5.0 | NaN | Comedy | Mathew Buck | 10,000 B.C. | 6 | 5 | NaN | 0.0 | NaN | http://www.imdb.com/title/tt1869849/?ref_=fn_tt_tt_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7.2 | NaN | 0 |
| 7 | Color | Kevin Lima | 84.0 | 100.0 | 36.0 | 439.0 | Eric Idle | 2000.0 | 66941559.0 | Adventure\|Comedy\|Family | Ioan Gruffudd | 102 Dalmatians | 26413 | 4182 | Jim Carter | 1.0 | dog\|parole\|parole officer\|prison\|puppy | http://www.imdb.com/title/tt0211181/?ref_=fn_tt_tt_1 | 77.0 | English | USA | G | 85000000.0 | 2000.0 | 795.0 | 4.8 | 1.85 | 372 |
| 8 | Color | Robert Moresco | 26.0 | 107.0 | 53.0 | 463.0 | Brad Renfro | 954.0 | 53481.0 | Crime\|Drama\|Thriller | Brian Dennehy | 10th & Wolf | 5557 | 2512 | Dash Mihok | 5.0 | desert storm\|fbi\|fbi agent\|fragmentation grenade\|woman kills attacker | http://www.imdb.com/title/tt0360323/?ref_=fn_tt_tt_1 | 34.0 | English | USA | R | 8000000.0 | 2006.0 | 551.0 | 6.4 | 2.35 | 294 |
| 9 | Color | Greg Marcks | 68.0 | 85.0 | 9.0 | 407.0 | Barbara Hershey | 861.0 | NaN | Comedy\|Crime\|Drama | Henry Thomas | 11:14 | 38273 | 2200 | Shawn Hatosy | 1.0 | convenience store\|multiple perspectives\|murder\|paramedic\|van | http://www.imdb.com/title/tt0331811/?ref_=fn_tt_tt_1 | 133.0 | English | USA | R | 6000000.0 | 2003.0 | 618.0 | 7.2 | 1.85 | 0 |
Last rows
| | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | actor_1_name | movie_title | num_voted_users | cast_total_facebook_likes | actor_3_name | facenumber_in_poster | plot_keywords | movie_imdb_link | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5033 | Color | Mora Stephens | 35.0 | 103.0 | 5.0 | 842.0 | Alexandra Breckenridge | 1000.0 | NaN | Drama\|Thriller | Ray Winstone | Zipper | 4091 | 3408 | Elena Satine | 0.0 | escort\|f word\|no opening credits\|one word title\|prosecutor | http://www.imdb.com/title/tt3346224/?ref_=fn_tt_tt_1 | 20.0 | English | USA | R | 4500000.0 | 2015.0 | 1000.0 | 5.7 | 2.35 | 987 |
| 5034 | Color | Kevin Hamedani | 64.0 | 89.0 | 23.0 | 23.0 | Janette Armand | 199.0 | NaN | Comedy\|Horror\|Sci-Fi | Russell Hodgkinson | ZMD: Zombies of Mass Destruction | 3650 | 292 | Kevin Hamedani | 0.0 | cult\|homosexual\|island\|survival horror\|zombie | http://www.imdb.com/title/tt1134674/?ref_=fn_tt_tt_1 | 39.0 | English | USA | R | 500000.0 | 2009.0 | 37.0 | 5.1 | 1.85 | 0 |
| 5035 | Color | David Fincher | 377.0 | 162.0 | 21000.0 | 495.0 | Jake Gyllenhaal | 21000.0 | 33048353.0 | Crime\|Drama\|History\|Mystery\|Thriller | Robert Downey Jr. | Zodiac | 301279 | 36928 | Anthony Edwards | 0.0 | cartoonist\|reporter\|serial killer\|zodiac\|zodiac killer | http://www.imdb.com/title/tt0443706/?ref_=fn_tt_tt_1 | 589.0 | English | USA | R | 65000000.0 | 2007.0 | 15000.0 | 7.7 | 2.35 | 12000 |
| 5036 | Color | K. King | 150.0 | 93.0 | 3.0 | 115.0 | Shona Kay | 214.0 | NaN | Action\|Comedy\|Horror | Jason K. Wixom | Zombie Hunter | 2057 | 656 | Jarrod Phillips | 2.0 | desert\|drifter\|seduction\|siege\|zombie | http://www.imdb.com/title/tt2446502/?ref_=fn_tt_tt_1 | 30.0 | English | USA | Not Rated | 1000000.0 | 2013.0 | 211.0 | 3.5 | 2.35 | 0 |
| 5037 | Color | Ruben Fleischer | 445.0 | 88.0 | 181.0 | 11.0 | Bill Murray | 15000.0 | 75590286.0 | Adventure\|Comedy\|Horror\|Sci-Fi | Emma Stone | Zombieland | 386217 | 28011 | Derek Graf | 4.0 | amusement park\|on the road\|zombie\|zombie apocalypse\|zombie spoof | http://www.imdb.com/title/tt1156398/?ref_=fn_tt_tt_1 | 553.0 | English | USA | R | 23600000.0 | 2009.0 | 13000.0 | 7.7 | 2.35 | 26000 |
| 5038 | Color | Frank Coraci | 178.0 | 102.0 | 153.0 | 269.0 | Leslie Bibb | 3000.0 | 80360866.0 | Comedy\|Family\|Romance | Rosario Dawson | Zookeeper | 44662 | 5392 | Nicholas Turturro | 1.0 | champagne bottle\|coca cola\|jewelry box\|red bull\|zoo | http://www.imdb.com/title/tt1222817/?ref_=fn_tt_tt_1 | 127.0 | English | USA | PG | 80000000.0 | 2011.0 | 1000.0 | 5.2 | 2.35 | 0 |
| 5039 | Color | Ben Stiller | 226.0 | 102.0 | 0.0 | 1000.0 | Will Ferrell | 14000.0 | 28837115.0 | Comedy | Milla Jovovich | Zoolander 2 | 34964 | 24107 | Justin Theroux | 4.0 | chosen one\|fashion\|fashion model\|model\|retired | http://www.imdb.com/title/tt1608290/?ref_=fn_tt_tt_1 | 150.0 | English | USA | PG-13 | 50000000.0 | 2016.0 | 8000.0 | 4.8 | 2.35 | 28000 |
| 5040 | Color | Ben Stiller | 135.0 | 90.0 | 0.0 | 8000.0 | Alexander Skarsgård | 14000.0 | 45162741.0 | Comedy | Milla Jovovich | Zoolander | 201084 | 34565 | Will Ferrell | 0.0 | fashion\|malaysia\|male model\|reporter\|rival | http://www.imdb.com/title/tt0196229/?ref_=fn_tt_tt_1 | 523.0 | English | Germany | PG-13 | 28000000.0 | 2001.0 | 10000.0 | 6.6 | 2.35 | 0 |
| 5041 | Color | Peter Hewitt | 63.0 | 83.0 | 12.0 | 690.0 | Rip Torn | 2000.0 | 11631245.0 | Action\|Adventure\|Family\|Sci-Fi | Kevin Zegers | Zoom | 15015 | 5022 | Thomas F. Wilson | 5.0 | bruise\|female hero\|super strength\|superhero\|teenage superhero | http://www.imdb.com/title/tt0383060/?ref_=fn_tt_tt_1 | 113.0 | English | USA | PG | 35000000.0 | 2006.0 | 826.0 | 4.2 | 1.85 | 494 |
| 5042 | Color | Jérôme Salle | 69.0 | 110.0 | 22.0 | 44.0 | Tanya van Graan | 5000.0 | NaN | Crime\|Drama\|Thriller | Orlando Bloom | Zulu | 12817 | 5273 | Conrad Kemp | 0.0 | apartheid\|corpse\|male nudity\|murder\|police officer | http://www.imdb.com/title/tt2249221/?ref_=fn_tt_tt_1 | 43.0 | English | France | R | 16000000.0 | 2013.0 | 170.0 | 6.7 | 2.35 | 0 |
Most frequently occurring
| | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | actor_1_name | movie_title | num_voted_users | cast_total_facebook_likes | actor_3_name | facenumber_in_poster | plot_keywords | movie_imdb_link | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Black and White | Yimou Zhang | 283.0 | 80.0 | 611.0 | 576.0 | Tony Chiu Wai Leung | 5000.0 | 84961.0 | Action\|Adventure\|History | Jet Li | Hero | 149414 | 6229 | Maggie Cheung | 4.0 | china\|flying\|king\|palace\|sword | http://www.imdb.com/title/tt0299977/?ref_=fn_tt_tt_1 | 841.0 | Mandarin | China | PG-13 | 31000000.0 | 2002.0 | 643.0 | 7.9 | 2.35 | 0 | 2 |
| 1 | Color | Albert Hughes | 208.0 | 122.0 | 117.0 | 140.0 | Jason Flemyng | 40000.0 | 31598308.0 | Horror\|Mystery\|Thriller | Johnny Depp | From Hell | 124765 | 41636 | Ian Richardson | 1.0 | freemason\|jack the ripper\|opium\|prostitute\|victorian era | http://www.imdb.com/title/tt0120681/?ref_=fn_tt_tt_1 | 541.0 | English | USA | R | 35000000.0 | 2001.0 | 1000.0 | 6.8 | 2.35 | 0 | 2 |
| 2 | Color | Angelina Jolie Pitt | 322.0 | 137.0 | 11000.0 | 465.0 | Jack O'Connell | 769.0 | 115603980.0 | Biography\|Drama\|Sport\|War | Finn Wittrock | Unbroken | 103589 | 2938 | Alex Russell | 0.0 | emaciation\|male nudity\|plane crash\|prisoner of war\|torture | http://www.imdb.com/title/tt1809398/?ref_=fn_tt_tt_1 | 351.0 | English | USA | PG-13 | 65000000.0 | 2014.0 | 698.0 | 7.2 | 2.35 | 35000 | 2 |
| 3 | Color | Bill Condon | 322.0 | 115.0 | 386.0 | 12000.0 | Kristen Stewart | 21000.0 | 292298923.0 | Adventure\|Drama\|Fantasy\|Romance | Robert Pattinson | The Twilight Saga: Breaking Dawn - Part 2 | 185394 | 59177 | Taylor Lautner | 3.0 | battle\|friend\|super strength\|vampire\|vision | http://www.imdb.com/title/tt1673434/?ref_=fn_tt_tt_1 | 329.0 | English | USA | PG-13 | 120000000.0 | 2012.0 | 17000.0 | 5.5 | 2.35 | 65000 | 2 |
| 4 | Color | Brett Ratner | 245.0 | 101.0 | 420.0 | 467.0 | Rufus Sewell | 12000.0 | 72660029.0 | Action\|Adventure | Dwayne Johnson | Hercules | 115687 | 16235 | Ingrid Bolsø Berdal | 0.0 | army\|greek mythology\|hercules\|king\|mercenary | http://www.imdb.com/title/tt1267297/?ref_=fn_tt_tt_1 | 269.0 | English | USA | PG-13 | 100000000.0 | 2014.0 | 3000.0 | 6.0 | 2.35 | 21000 | 2 |
| 5 | Color | Bruce McCulloch | 52.0 | 85.0 | 54.0 | 455.0 | Megan Mullally | 985.0 | 13973532.0 | Comedy\|Crime | Martin Starr | Stealing Harvard | 11211 | 3065 | Chris Penn | 1.0 | black humor\|crying during sex\|harvard\|humor\|man with glasses | http://www.imdb.com/title/tt0265808/?ref_=fn_tt_tt_1 | 92.0 | English | USA | PG-13 | 25000000.0 | 2002.0 | 637.0 | 5.1 | 1.85 | 215 | 2 |
| 6 | Color | Danny Boyle | 393.0 | 101.0 | 0.0 | 888.0 | Spencer Wilding | 3000.0 | 2319187.0 | Crime\|Drama\|Mystery\|Thriller | Rosario Dawson | Trance | 92640 | 5056 | Tuppence Middleton | 0.0 | amnesia\|criminal\|heist\|hypnotherapy\|lost painting | http://www.imdb.com/title/tt1924429/?ref_=fn_tt_tt_1 | 212.0 | English | UK | R | 20000000.0 | 2013.0 | 1000.0 | 7.0 | 2.35 | 23000 | 2 |
| 7 | Color | David Yates | 248.0 | 110.0 | 282.0 | 103.0 | Alexander Skarsgård | 11000.0 | 124051759.0 | Action\|Adventure\|Drama\|Romance | Christoph Waltz | The Legend of Tarzan | 42372 | 21175 | Casper Crump | 2.0 | africa\|capture\|jungle\|male objectification\|tarzan | http://www.imdb.com/title/tt0918940/?ref_=fn_tt_tt_1 | 239.0 | English | USA | PG-13 | 180000000.0 | 2016.0 | 10000.0 | 6.6 | 2.35 | 29000 | 2 |
| 8 | Color | Frank Oz | 168.0 | 87.0 | 0.0 | 548.0 | Ewen Bremner | 22000.0 | 8579684.0 | Comedy | Peter Dinklage | Death at a Funeral | 89547 | 24324 | Kris Marshall | 0.0 | end credits roll call\|four word title\|funeral\|secret\|uncle | http://www.imdb.com/title/tt0795368/?ref_=fn_tt_tt_1 | 199.0 | English | USA | R | 9000000.0 | 2007.0 | 557.0 | 7.4 | 1.85 | 0 | 2 |
| 9 | Color | Guy Ritchie | 151.0 | 104.0 | 0.0 | 1000.0 | Brad Pitt | 26000.0 | 30093107.0 | Comedy\|Crime | Jason Statham | Snatch | 600996 | 39175 | Jason Flemyng | 6.0 | boxer\|boxing\|diamond\|fight\|gypsy | http://www.imdb.com/title/tt0208092/?ref_=fn_tt_tt_1 | 726.0 | English | UK | R | 6000000.0 | 2000.0 | 11000.0 | 8.3 | 1.85 | 27000 | 2 |
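The "# duplicates" column counts how often each fully identical row occurs. One way to reproduce such a count is sketched below using only the standard library. How the report itself computes the figure is not shown here, and the three-field rows (title, year, score) are a simplified stand-in for the full 28-column records.

```python
from collections import Counter

# Simplified stand-in rows: (movie_title, title_year, imdb_score).
rows = [
    ("Hero", 2002, 7.9),
    ("Snatch", 2000, 8.3),
    ("Hero", 2002, 7.9),
    ("Zodiac", 2007, 7.7),
    ("Snatch", 2000, 8.3),
]

def duplicate_counts(records):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}

dupes = duplicate_counts(rows)
for row, n in dupes.items():
    print(row, n)
```

Rows must be hashable for this approach, so each record is stored as a tuple; rows that occur only once are left out of the result.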